NOVEL NUCLEIC ACIDS AND POLYPEPTIDES

1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such 
polynucleotides, along with uses for these polynucleotides and proteins, for example in
therapeutic, diagnostic and research methods. 

2. BACKGROUND 

Technology aimed at the discovery of protein factors (including e.g., cytokines, such as
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past
decade. The now routine hybridization cloning and expression cloning techniques clone novel
polynucleotides "directly" in the sense that they rely on information directly related to the
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of
hybridization cloning; activity of the protein in the case of expression cloning). More recent
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences
based on the presence of a now well-recognized secretory leader sequence motif, as well as
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the
state of the art by making available large numbers of DNA/amino acid sequences for proteins
that are known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping, identification of mutations responsible for
genetic disorders or other traits, assessment of biodiversity, and production of many other types
of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel 
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules, 
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more
epitopes present on such polypeptides, as well as hybridomas producing such antibodies. 

The compositions of the present invention additionally include vectors, including expression 
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such 
polynucleotides and cells genetically engineered to express such polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO: 1-1786 and 3573-5358. The polypeptide sequences are
designated SEQ ID NO: 2n (wherein n = 1 to 20). The nucleic acids and polypeptides are provided
in the Sequence Listing. In the nucleic acids provided in the Sequence Listing, A is adenosine; C is
cytosine; G is guanine; T is thymine; and N is any of the four bases. In the amino acids provided in
the Sequence Listing, * corresponds to the stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-1786 and 3573-5358 under stringent hybridization
conditions; nucleic acid sequences which are allelic variants or species homologues of any of the
nucleic acid sequences recited above; or nucleic acid sequences that encode a peptide comprising a
specific domain or truncation of the peptides encoded by SEQ ID NO: 1-1786 and 3573-5358. Also
included is a polynucleotide comprising a nucleotide sequence having at least 90% identity to an
identifying sequence of SEQ ID NO: 1-1786 and 3573-5358 or a degenerate variant or fragment
thereof. The identifying sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The sequence information
can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that uniquely identifies or
represents the sequence information of SEQ ID NO: 1-1786 and 3573-5358.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information or identifying information of each sequence can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358
or novel segments or parts of the nucleic acids of the invention are used as primers in
expression assays that are well known in the art. In a particularly preferred embodiment, the nucleic
acid sequences of SEQ ID NO: 1-1786 and 3573-5358 or novel segments or parts of the nucleic
acids provided herein are used in diagnostics for identifying expressed genes or, as well known in
the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for
physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a 
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-1786 and
3573-5358; a polynucleotide comprising any of the full length protein coding sequences of SEQ ID
NO: 1-1786 and 3573-5358; and a polynucleotide comprising any of the nucleotide sequences of the
mature protein coding sequences of SEQ ID NO: 1-1786 and 3573-5358. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that hybridizes under 
stringent hybridization conditions to (a) the complement of any one of the nucleotide sequences set 
forth in SEQ ID NO: 1-1786 and 3573-5358; (b) a nucleotide sequence encoding any one of the 
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic 
variant of any polynucleotides recited above; (d) a polynucleotide which encodes a species homolog 
(e.g. orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a 
polypeptide comprising a specific domain or truncation of any of the polypeptides comprising an 
amino acid sequence set forth in the Sequence Listing. 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide 
comprising any of the amino acid sequences set forth in the Sequence Listing; or the corresponding 
full length or mature protein. Polypeptides of the invention also include polypeptides with biological 
activity that are encoded by (a) any of the polynucleotides having a nucleotide sequence set forth in
SEQ ID NO: 1-1786 and 3573-5358; or (b) polynucleotides that hybridize to the complement of the
polynucleotides of (a) under stringent hybridization conditions. Biologically or immunologically 
active variants of any of the polypeptide sequences in the Sequence Listing, and "substantial 
equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% 
amino acid sequence identity) that preferably retain biological activity are also contemplated. The 
polypeptides of the invention may be wholly or partially chemically synthesized but are preferably 
produced by recombinant means using the genetically engineered cells (e.g. host cells) of the 
invention. 
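
As one illustration of the percent-identity thresholds listed above, the following sketch computes identity over two sequences that are assumed to be already aligned to equal length, with "-" marking gaps. The names percent_identity and meets_threshold, the gap symbol, and the choice to count only columns in which neither sequence has a gap are assumptions made for illustration; this passage does not specify an alignment method or a scoring convention.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of gap-free aligned columns that hold the same residue.

    Assumption: the inputs are pre-aligned and of equal length; no
    alignment is performed here.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        identical += a == b
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True when identity reaches the chosen threshold (for example 65 to 99)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

A different denominator (for example, the full alignment length including gapped columns) would give lower values for gapped alignments, so any reported identity should state which convention was used.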

The invention also provides compositions comprising a polypeptide of the invention.
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of
the invention.

The invention also relates to methods for producing a polypeptide of the invention 
comprising growing a culture of the host cells of the invention in a suitable culture medium 
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide 
from the culture or from the host cells. Preferred embodiments include those in which the 
protein produced by such process is a mature form of the protein. 

Polynucleotides according to the invention have numerous applications in a variety of 
techniques known to those skilled in the art of molecular biology. These techniques include use 
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA 
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is 
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used 
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample 
using, e.g., in situ hybridization. 

In other exemplary embodiments, the polynucleotides are used in diagnostics as 
expressed sequence tags for identifying expressed genes or, as well known in the art and 
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical 
mapping of the human genome. 

The polypeptides according to the invention can be used in a variety of conventional 
procedures and methods that are currently applied to other proteins. For example, a polypeptide 
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such 
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the 
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight 
markers, and as a food supplement. 

Methods are also provided for preventing, treating, or ameliorating a medical condition 
which comprises the step of administering to a mammalian subject a therapeutically effective 
amount of a composition comprising a polypeptide of the present invention and a 
pharmaceutically acceptable carrier. 

In particular, the polypeptides and polynucleotides of the invention can be utilized, for 
example, in methods for the prevention and/or treatment of disorders involving aberrant protein 
expression or biological activity. 



WO 01/53312 PCT/US00/34263

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the
identification of subjects exhibiting a predisposition to such conditions. The invention provides
a method for detecting the polynucleotides of the invention in a sample, comprising contacting
the sample with a compound that binds to and forms a complex with the polynucleotide of
interest for a period sufficient to form the complex and under conditions sufficient to form a
complex and detecting the complex such that if a complex is detected, the polynucleotide of
interest is detected. The invention also provides a method for detecting the polypeptides of the
invention in a sample comprising contacting the sample with a compound that binds to and forms
a complex with the polypeptide under conditions and for a period sufficient to form the complex
and detecting the formation of the complex such that if a complex is formed, the polypeptide is
detected.

The invention also provides kits comprising polynucleotide probes and/or monoclonal 
antibodies, and optionally quantitative standards, for carrying out methods of the invention. 
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and 
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as 
recited above. 

The invention also provides methods for the identification of compounds that modulate 
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides 
of the invention. Such methods can be utilized, for example, for the identification of compounds 
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are 
not limited to, assays for identifying compounds and other substances that interact with (e.g. , 
bind to) the polypeptides of the invention. The invention provides a method for identifying a 
compound that binds to the polypeptides of the invention comprising contacting the compound 
with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound 
complex, wherein the complex drives expression of a reporter gene sequence in the cell; and 
detecting the complex by detecting the reporter gene sequence expression such that if expression 
of the reporter gene is detected, the compound that binds to a polypeptide of the invention is
identified. 

The invention also provides methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.

The polypeptides of the present invention and the polynucleotides encoding them are also 
useful for the same functions known to one of skill in the art as the polypeptides and 
polynucleotides to which they have homology (set forth in Table 2); for which they have a 
signature region (as set forth in Table 3); or for which they have homology to a gene family (as 
set forth in Table 4). If no homology is set forth for a sequence, then the polypeptides and 
polynucleotides of the present invention are useful for a variety of applications, as described 
herein, including use in arrays for detection. 

4. DETAILED DESCRIPTION OF THE INVENTION 
4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular forms "a", 
"an" and "the" include plural references unless the context clearly dictates otherwise. 

The term "active" refers to those forms of the polypeptide which retain the biologic 
and/or immunologic activities of any naturally occurring polypeptide. According to the 
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the capability of the
natural, recombinant or synthetic polypeptide to induce a specific immune response in
appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or 
enzymatic molecules as part of a normal or disease process. 

The terms "complementary" or "complementarity" refer to the natural binding of 
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind or it may be "complete" such that 
total complementarity exists between the single stranded molecules. The degree of 
complementarity between the nucleic acid strands has significant effects on the efficiency and 
strength of the hybridization between the nucleic acid strands. 
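
The base-pairing example above can be sketched in a few lines of Python. This is a minimal illustration, not part of the disclosure: the function names and the idea of scoring "partial" complementarity as a fraction of paired positions are assumptions, and only the four DNA bases are handled.

```python
# Watson-Crick pairing for the four DNA bases; RNA (U) is deliberately left out.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Base-by-base complement, written in the same left-to-right order."""
    return "".join(PAIRS[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Complementary strand rewritten in the conventional 5'-to-3' direction."""
    return complement(seq)[::-1]


def complementarity(strand_5to3: str, other_3to5: str) -> float:
    """Fraction of aligned positions that pair (1.0 = complete, below 1.0 = partial).

    Assumption: both strands have equal length and are already aligned
    antiparallel, i.e. the second strand is written 3'-to-5'.
    """
    if len(strand_5to3) != len(other_3to5):
        raise ValueError("strands must have equal length")
    paired = sum(PAIRS[a] == b for a, b in zip(strand_5to3.upper(), other_3to5.upper()))
    return paired / len(strand_5to3)
```

With these definitions, complement("AGT") gives "TCA", matching the 5'-AGT-3' / 3'-TCA-5' pair in the text, and a strand pair with one non-pairing position out of three scores two thirds.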

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many 
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line 
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady 
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages particularly
from the yolk sac, mesenteries, or gonadal ridges during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that
comprise the adult specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides which
modulates the expression of an operably linked ORF or another EMF.

As used herein, a sequence is said to "modulate the expression of an operably linked
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs
include, but are not limited to, promoters, and promoter modulating sequences (inducible
elements). One class of EMFs comprises nucleic acid fragments which induce the expression of an
operably linked ORF in response to a specific regulatory factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic
acid which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs: 1-20.

Probes may, for example, be used to determine whether specific mRNA molecules are 
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as 
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the 
art. Probes of the present invention, their preparation and/or labeling are elaborated in 
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor 
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John 
Wiley & Sons, New York NY, both of which are incorporated herein by reference in their 
entirety. 

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The
sequence information can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that
uniquely identifies or represents the sequence information of that sequence of SEQ ID NO: 1-1786
and 3573-5358. One such segment can be a twenty-mer nucleic acid sequence because the
probability that a twenty-mer is fully matched in the human genome is approximately 1 in 300. In
the human genome, there are three billion base pairs in one set of chromosomes. Because 4^20
possible twenty-mers exist, there are approximately 300 times more twenty-mers than there are
base pairs in a set of human chromosomes. Using the same analysis, the probability for a
seventeen-mer to be fully matched in the human genome is approximately 1 in 5. When these
segments are used in arrays for expression studies, fifteen-mer segments can be used. The
probability that the fifteen-mer is fully matched in the expressed sequences is also approximately
one in five because expressed sequences comprise less than approximately 5% of the entire
genome sequence.
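
The arithmetic behind these estimates can be reproduced with a short sketch. It simply divides the number of possible k-mers (4^k) by the number of base pairs in the target, using the three-billion-base-pair genome and the 5% expressed fraction stated above; the function name kmers_per_position and the treatment of "less than approximately 5%" as exactly 5% are assumptions made for illustration.

```python
GENOME_BP = 3_000_000_000     # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05     # upper bound on expressed sequence (from the text)


def kmers_per_position(k: int, target_bp: float) -> float:
    """Ratio of possible k-mers (4**k) to the number of positions in the target.

    A ratio of R is read in the text as a chance of roughly 1 in R that a
    given k-mer is fully matched somewhere in the target.
    """
    return 4 ** k / target_bp


cases = [
    (20, GENOME_BP, "whole genome"),
    (17, GENOME_BP, "whole genome"),
    (15, GENOME_BP * EXPRESSED_FRACTION, "expressed sequences"),
]
for k, target_bp, label in cases:
    ratio = kmers_per_position(k, target_bp)
    print(f"{k}-mer against {label}: about 1 in {ratio:.0f}")
```

Under these assumptions the ratios come out near 370, 6 and 7 respectively, which is the same order of magnitude as the rounded figures of 300, 5 and 5 given in the text.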

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five-mer. The probability that the twenty-five-mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match (1 ÷ 4^25) times the
increased probability for mismatch at each nucleotide position (3 x 25). The probability that an
eighteen-mer with a single mismatch can be detected in an array for expression studies is
approximately one in five. The probability that a twenty-mer with a single mismatch can be
detected in a human genome is approximately one in five.
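
The single-mismatch calculation described above can be written out as follows. This is a sketch under stated assumptions: the per-position probability is the full-match probability (1 ÷ 4^k) multiplied by the 3 x k possible one-base variants, and the expected number of occurrences is obtained by scaling that probability by the number of base pairs in the target. The function names, and the use of exactly 5% of three billion base pairs for expressed sequences, are illustrative choices rather than part of the disclosure.

```python
GENOME_BP = 3_000_000_000          # base pairs in one set of chromosomes (from the text)
EXPRESSED_BP = 0.05 * GENOME_BP    # "less than approximately 5%" taken at its upper bound


def single_mismatch_probability(k: int) -> float:
    """(1 / 4**k) * (3 * k): full-match probability times the 3*k one-base variants."""
    return (3 * k) / 4 ** k


def expected_hits(k: int, target_bp: float) -> float:
    """Expected number of single-mismatch occurrences across target_bp positions."""
    return single_mismatch_probability(k) * target_bp


for k, target_bp, label in [
    (25, GENOME_BP, "whole genome"),
    (20, GENOME_BP, "whole genome"),
    (18, EXPRESSED_BP, "expressed sequences"),
]:
    print(f"{k}-mer, one mismatch, {label}: {expected_hits(k, target_bp):.3g}")
```

With these inputs the eighteen-mer and twenty-mer cases give roughly 0.12 and 0.16 expected occurrences (about one in eight and one in six), in the same range as the "approximately one in five" stated in the text, while the twenty-five-mer case is several hundred times rarer.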

The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein.
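
As a concrete reading of this definition, the sketch below scans one forward reading frame of a DNA string and returns each maximal run of triplets that contains no termination codon. It assumes the standard stop codons TAA, TAG and TGA and does not require a start codon, since the definition above does not mention one; the function name and the min_codons parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code termination codons


def open_reading_frames(seq: str, frame: int = 0, min_codons: int = 1) -> list[str]:
    """Return maximal stop-free runs of codons in one forward frame (0, 1 or 2)."""
    seq = seq.upper()
    orfs: list[str] = []
    current: list[str] = []
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            # A termination codon closes the current run.
            if len(current) >= min_codons:
                orfs.append("".join(current))
            current = []
        else:
            current.append(codon)
    if len(current) >= min_codons:
        orfs.append("".join(current))  # run that reaches the end of the sequence
    return orfs
```

For example, open_reading_frames("ATGAAATAGGGGCCC") splits at the TAG codon and yields the two stop-free segments "ATGAAA" and "GGGCCC". Scanning all six frames would additionally require repeating the scan for frames 1 and 2 and for the reverse complement.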

The terms "operably linked" or "operably associated" refer to functionally related nucleic 
acid sequences. For example, a promoter is operably associated or operably linked with a coding 
sequence if the promoter controls the transcription of the coding sequence. While operably 
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic 
elements, e.g., repressor genes, are not contiguously linked to the coding sequence but still control
transcription/translation of the coding sequence. 

The term "pluripotent" refers to the capability of a cell to differentiate into a number of 
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its 
differentiation capability in comparison to a totipotent cell. 

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide, 
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or 
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino 
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more 
preferably at least about 9 amino acids and most preferably at least about 17 or more amino
acids. The peptide preferably is not greater than about 200 amino acids, more preferably less
than 150 amino acids and most preferably less than 100 amino acids. Preferably the peptide is
from about 5 to about 200 amino acids. To be active, any polypeptide must have sufficient 
length to display biological and/or immunological activity. 

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that 
have not been genetically engineered and specifically contemplates various polypeptides arising 
from post-translational modifications of the polypeptide including, but not limited to, acetylation, 
carboxylation, glycosylation, phosphorylation, lipidation and acylation. 

The term "translated protein coding portion" means a sequence which encodes for the full 
length protein which may include any leader sequence or any processing sequence. 

The term "mature protein coding sequence" means a sequence which encodes a peptide 
or protein without a signal or leader sequence. The "mature protein portion" means that portion 
of the protein which does not include a signal or leader sequence. The peptide may have been 
produced by processing in the cell which removes any leader/signal sequence. The mature 
protein portion may or may not include the initial methionine residue. The methionine residue 
may be removed from the protein during processing in the cell. The peptide may be produced 
synthetically or the protein may have been produced using a polynucleotide only encoding for 
the mature protein coding sequence. 

The term "derivative" refers to polypeptides chemically modified by such techniques as
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer 
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or 
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur 
in human proteins. 

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be 
replaced, added or deleted without abolishing activities of interest, may be found by comparing 
the sequence of the particular polypeptide with that of homologous peptides and minimizing the 
number of amino acid sequence changes made in regions of high homology (conserved regions) 
or by replacing amino acids with consensus sequence. 

Alternatively, recombinant variants encoding these same or similar polypeptides may be 
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon 
substitutions, such as the silent changes which produce various restriction sites, may be 
introduced to optimize cloning into a plasmid or viral vector or expression in a particular 
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in 
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of 
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain 
affinities, or degradation/turnover rate. 
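
A silent change that introduces a restriction site can be illustrated with a small sketch. It builds the standard genetic code table, translates a short coding sequence, and checks that a variant encodes the same amino acids while now containing a BamHI recognition site (GGATCC). The two example sequences are hypothetical and are not taken from the Sequence Listing; the function names are likewise illustrative.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, reading complete codons from position 0."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def is_silent(original: str, variant: str) -> bool:
    """True when both sequences have equal length and encode the same protein."""
    return len(original) == len(variant) and translate(original) == translate(variant)


BAMHI_SITE = "GGATCC"
original = "ATGGGTTCTAAA"  # hypothetical sequence encoding Met-Gly-Ser-Lys
variant = "ATGGGATCCAAA"   # GGT->GGA and TCT->TCC: same protein, new BamHI site

print(translate(original), translate(variant), is_silent(original, variant))
print(BAMHI_SITE in original, BAMHI_SITE in variant)
```

Both sequences translate to the same four residues because GGT and GGA both encode glycine and TCT and TCC both encode serine, which is the "redundancy" the passage refers to.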

Preferably, amino acid "substitutions" are the result of replacing one amino acid with 
another amino acid having similar structural and/or chemical properties, i.e., conservative amino 
acid replacements. "Conservative" amino acid substitutions may be made on the basis of 
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic 
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include 
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar 
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and 
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and 
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or 
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10 
amino acids. The variation allowed may be experimentally determined by systematically making 
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using 
recombinant DNA techniques and assaying the resulting recombinant variants for activity. 
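
The four residue classes named above lend themselves to a simple lookup. In the sketch below a substitution is treated as conservative when the original and replacement residues fall in the same class; this same-class rule, the one-letter codes, and the function names are illustrative assumptions, since the text allows similarity to be judged on several properties.

```python
# One-letter codes for the four classes listed in the text.
GROUPS = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}
CLASS_OF = {aa: name for name, members in GROUPS.items() for aa in members}


def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues belong to the same class."""
    return CLASS_OF[original.upper()] == CLASS_OF[replacement.upper()]


def substitutions(reference: str, variant: str) -> list[tuple[int, str, str, bool]]:
    """List (1-based position, old, new, conservative?) for equal-length sequences."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        (pos, old, new, is_conservative(old, new))
        for pos, (old, new) in enumerate(zip(reference.upper(), variant.upper()), start=1)
        if old != new
    ]
```

For instance, comparing "MKDL" with "MRNL" reports a conservative Lys-to-Arg change at position 2 and a non-conservative Asp-to-Asn change at position 3. Insertions and deletions are outside the scope of this sketch because it compares sequences position by position.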

Alternatively, where alteration of function is desired, insertions, deletions or 
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide 
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover 
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited 
for expression, scale up and the like in the host cells chosen for expression. For example, 
cysteine residues can be deleted or substituted with another amino acid residue in order to 
eliminate disulfide bridges. 

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more
preferably at least 99% by weight, of the indicated biological macromolecules present (but water,
buffers, and other small molecules, especially molecules having a molecular weight of less than 
1000 daltons, can be present). 

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from 
at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or 
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in 
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a 
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or 
polypeptides present in their natural source. 

The term "recombinant," when used herein to refer to a polypeptide or protein, means 
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian) 
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in 
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial" 
defines a polypeptide or protein essentially free of native endogenous substances and 
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most 
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or
proteins expressed in yeast will have a glycosylation pattern in general different from those 
expressed in mammalian cells. 

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus 
or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can 
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements 
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural 
or coding sequence which is transcribed into mRNA and translated into protein, and (3) 
appropriate transcription initiation and termination sequences. Structural units intended for use
in yeast or eukaryotic expression systems preferably include a leader sequence enabling
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant
protein is expressed without a leader or transport sequence, it may include an amino terminal
methionine residue. This residue may or may not be subsequently cleaved from the expressed
recombinant protein to provide a final product.

The term "recombinant expression system" means host cells which have stably integrated 
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant 
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will 
express heterologous polypeptides or proteins upon induction of the regulatory elements linked 
to the DNA segment or synthetic gene to be expressed. This term also means host cells which
have stably integrated a recombinant genetic element or elements having a regulatory role in
gene expression, for example, promoters or enhancers. Recombinant expression systems as
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells
can be prokaryotic or eukaryotic.

The term "secreted" includes a protein that is transported across or through a membrane, 
including transport as a result of signal sequences in its amino acid sequence when it is expressed 
in a suitable host cell. "Secreted" proteins include without limitation proteins secreted wholly 
(e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are expressed.
"Secreted" proteins also include without limitation proteins that are transported across the
membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to include
proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney, P.A. and
Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells (e.g.,
Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol.
16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader 
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence 
may be naturally present on the polypeptides of the present invention or provided from 
heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood in the
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e.,
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are
described herein in the examples.

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for
14-base oligonucleotides), 48°C (for 17-base oligonucleotides), 55°C (for 20-base
oligonucleotides), and 60°C (for 23-base oligonucleotides).
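
The oligonucleotide wash temperatures listed above can be held in a small lookup table. The following Python sketch is illustrative only; the names OLIGO_WASH_TEMP_C and wash_temperature_c are hypothetical, and no interpolation between the listed lengths is assumed.

```python
# Exemplary stringent wash temperatures (6X SSC/0.05% sodium pyrophosphate),
# keyed by oligonucleotide length in bases, as listed in the text.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def wash_temperature_c(oligo_length: int) -> int:
    """Return the listed wash temperature (deg C) for an oligonucleotide length.

    Only the four lengths given in the text are supported; other lengths
    raise ValueError rather than guessing an interpolated value.
    """
    try:
        return OLIGO_WASH_TEMP_C[oligo_length]
    except KeyError:
        raise ValueError(
            f"no listed wash temperature for {oligo_length}-base oligonucleotides"
        ) from None
```

Refusing unlisted lengths is a deliberate choice in this sketch, since the text gives discrete exemplary conditions and not a formula.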

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid 
sequences, for example a mutant sequence, that varies from a reference sequence by one or more 
substitutions, deletions, or additions, the net effect of which does not result in an adverse 
functional dissimilarity between the reference and subject sequences. Typically, such a 
substantially equivalent sequence varies from one of those listed herein by no more than about 
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a 
substantially equivalent sequence, as compared to the corresponding reference sequence, divided 
by the total number of residues in the substantially equivalent sequence is about 0.35 or less). 
Such a sequence is said to have 65% sequence identity to the listed sequence. In one 
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a 
listed sequence by no more than 30% (70% sequence identity); in a variation of this embodiment, 
by no more than 25% (75% sequence identity); and in a further variation of this embodiment, by 
no more than 20% (80% sequence identity); in a further variation of this embodiment, by no
more than 10% (90% sequence identity); and in a further variation of this embodiment, by no
more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid
sequences according to the invention preferably have at least 80% sequence identity with a listed
amino acid sequence, more preferably at least 90% sequence identity. Substantially equivalent
nucleotide sequences of the invention can have lower percent sequence identities, taking into
account, for example, the redundancy or degeneracy of the genetic code. Preferably, the nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity, and most
preferably at least about 95% identity. For the purposes of the present invention, sequences
having substantially equivalent biological activity and substantially equivalent expression
characteristics are considered substantially equivalent. For the purposes of determining
equivalence, truncation of the mature sequence (e.g., via a mutation which creates a spurious 
stop codon) should be disregarded. Sequence identity may be determined, e.g., using the Jotun 
Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). Identity between sequences can 
also be determined by other methods known in the art, e.g. by varying hybridization conditions. 
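
The ratio used in the definition of "substantially equivalent" (substitutions, additions, and deletions divided by the number of residues in the substantially equivalent sequence) can be sketched in Python as follows. This is a minimal illustration and not the Jotun Hein method: it assumes the two sequences have already been aligned to equal length with '-' marking gaps, and the function names are hypothetical.

```python
def fraction_varied(aligned_reference: str, aligned_subject: str) -> float:
    """Fraction by which the subject sequence varies from the reference.

    Both inputs must already be aligned to equal length, with '-' marking
    a gap. Substitutions, additions (gap in the reference) and deletions
    (gap in the subject) each count as one difference. The count is divided
    by the number of residues in the subject sequence, as in the text.
    """
    if len(aligned_reference) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    differences = 0
    subject_residues = 0
    for ref, sub in zip(aligned_reference.upper(), aligned_subject.upper()):
        if ref == "-" and sub == "-":
            continue  # column carries no residue in either sequence
        if sub != "-":
            subject_residues += 1
        if ref != sub:
            differences += 1
    if subject_residues == 0:
        raise ValueError("subject sequence has no residues")
    return differences / subject_residues


def percent_identity(aligned_reference: str, aligned_subject: str) -> float:
    """Percent sequence identity, taken as 100 * (1 - fraction varied)."""
    return 100.0 * (1.0 - fraction_varied(aligned_reference, aligned_subject))
```

Under this sketch, a subject sequence with a fraction varied of 0.35 or less corresponds to the "no more than about 35%" (65% sequence identity) threshold stated in the definition.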

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell 
types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so that the 
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction 
of nucleic acids into a suitable host cell by use of a virus or viral vector. 

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides 
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified 
using known UMFs as a target sequence or target motif with the computer-based systems 
described below. The presence and activity of a UMF can be confirmed by attaching the 
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated 
with an appropriate host under appropriate conditions and the uptake of the marker sequence is 
determined. As described above, a UMF will increase the frequency of uptake of a linked 
marker sequence. 

Each of the above terms is meant to encompass all that is described for each, unless the 
context dictates otherwise. 

4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence Listing. 
The isolated polynucleotides of the invention include a polynucleotide comprising the
nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358; a polynucleotide encoding any one
of the peptide sequences of SEQ ID NO:1787-3572 and 5359-7144; and a polynucleotide
comprising the nucleotide sequence encoding the mature protein coding sequence of the
polypeptides of any one of SEQ ID NO:1787-3572 and 5359-7144. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent conditions to (a) the complement of any of the nucleotide sequences of SEQ ID
NO:1-1786 and 3573-5358; (b) nucleotide sequences encoding any one of the amino acid sequences
set forth in the Sequence Listing; (c) a polynucleotide which is an allelic variant of any
polynucleotide recited above; (d) a polynucleotide which encodes a species homolog of any of
the proteins recited above; or (e) a polynucleotide that encodes a polypeptide comprising a
specific domain or truncation of the polypeptides of SEQ ID NO:1787-3572 and 5359-7144.
Domains of interest may depend on the nature of the encoded polypeptide; e.g., domains in 
receptor-like polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic 
domains, or combinations thereof; domains in immunoglobulin-like proteins include the variable 
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and 
substrate binding domains; and domains in ligand polypeptides include receptor-binding 
domains.

The polynucleotides of the invention include naturally occurring or wholly or partially
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides 
may include all of the coding region of the cDNA or may represent a portion of the coding 
region of the cDNA. 

The present invention also provides genes corresponding to the cDNA sequences disclosed
herein. The corresponding genes can be isolated in accordance with known methods using the
sequence information disclosed herein. Such methods include the preparation of probes or primers
from the disclosed sequence information for identification and/or amplification of genes in
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 can be obtained
by screening appropriate cDNA or genomic DNA libraries under suitable hybridization conditions
using any of the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 or a portion thereof as a
probe. Alternatively, the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 may be used as the
basis for suitable primer(s) that allow identification and/or amplification of genes in appropriate
genomic DNA or cDNA libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and sequences 
(including cDNA and genomic sequences) obtained from one or more public databases, such as 
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information,
representative fragment or segment information, or novel segment information for the full-length
gene.

The invention also provides polynucleotides including nucleotide
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about
75%, at least about 80%, more typically at least about 90%, and even more typically at least
about 95%, sequence identity to a polynucleotide recited above.

Included within the scope of the nucleic acid sequences of the invention are nucleic acid
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences
of SEQ ID NO:1-1786 and 3573-5358, or complements thereof, which fragment is greater than
about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9 nucleotides and
most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 20 nucleotides or more
that are selective for (i.e., specifically hybridize to) any one of the polynucleotides of the
invention are contemplated. Probes capable of specifically hybridizing to a polynucleotide can
differentiate polynucleotide sequences of the invention from other polynucleotide sequences in
the same family of genes or can differentiate human genes from genes of other species, and are
preferably based on unique nucleotide sequences.
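
One simple way to look for candidate probe fragments based on unique nucleotide sequences is to list the fixed-length subsequences of a target that do not occur in related sequences. The Python sketch below is an assumption-laden illustration: the function name selective_kmers is hypothetical, exact string matching is used as a crude stand-in for specific hybridization, and mismatched duplexes and the complementary strand are ignored.

```python
def selective_kmers(target: str, related: list[str], k: int = 17) -> list[str]:
    """Return k-mers of `target` that occur in none of the `related` sequences.

    Each distinct k-mer is reported once, in order of first appearance.
    Exact matching only; this does not model hybridization stringency.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    target = target.upper()
    related_upper = [seq.upper() for seq in related]
    seen = set()
    unique = []
    for i in range(len(target) - k + 1):
        kmer = target[i:i + k]
        if kmer in seen:
            continue
        seen.add(kmer)
        if not any(kmer in seq for seq in related_upper):
            unique.append(kmer)
    return unique
```

The default k of 17 follows the fragment length mentioned in the text; any candidate found this way would still need to be checked experimentally under the intended hybridization conditions.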

The sequences falling within the scope of the present invention are not limited to these
specific sequences, but also include allelic and species variations thereof. Allelic and species
variations can be routinely determined by comparing the sequence provided in SEQ ID NO:1-1786
and 3573-5358, a representative fragment thereof, or a nucleotide sequence at least 90% identical,
preferably 95% identical, to SEQ ID NO:1-1786 and 3573-5358 with a sequence from another
isolate of the same species. Furthermore, to accommodate codon variability, the invention includes
nucleic acid molecules coding for the same amino acid sequences as do the specific ORFs disclosed
herein. In other words, in the coding region of an ORF, substitution of one codon for another codon
that encodes the same amino acid is expressly contemplated.
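
The codon substitution described above can be illustrated by translating two ORFs and comparing the results. The following Python sketch assumes the standard genetic code and in-frame DNA input; the names CODON_TABLE, translate, and encode_same_protein are hypothetical and are used only for illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate a DNA open reading frame codon by codon ('*' marks a stop)."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def encode_same_protein(orf_a: str, orf_b: str) -> bool:
    """True if two ORFs differ at most by synonymous codon substitutions."""
    return translate(orf_a) == translate(orf_b)
```

For example, ATGGCTTTA and ATGGCGCTG differ at two codons (GCT/GCG and TTA/CTG) yet both translate to Met-Ala-Leu, so encode_same_protein returns True for that pair.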

The nearest neighbor or homology result for the nucleic acids of the present invention,
including SEQ ID NO:1-1786 and 3573-5358, can be obtained by searching a database using an
algorithm or a program. Preferably, BLAST (Basic Local Alignment Search Tool) is used
to search for local sequence alignments (Altschul, S.F., J. Mol. Evol. 36:290-300 (1993) and
Altschul, S.F. et al., J. Mol. Biol. 215:403-410 (1990)). Alternatively, a FASTA version 3 search
against Genpept, using the Fastxy algorithm, can be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also
provided by the present invention. Species homologs may be isolated and identified by making
suitable probes or primers from the sequences provided herein and screening a suitable nucleic
acid source from the desired species.

The invention also encompasses allelic variants of the disclosed polynucleotides or
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also
encode proteins which are identical, homologous or related to that encoded by the
polynucleotides.

The nucleic acid sequences of the invention are further directed to sequences which
encode variants of the described nucleic acids. These amino acid sequence variants may be
prepared by methods known in the art by introducing appropriate nucleotide changes into a
native or variant polynucleotide. There are two variables in the construction of amino acid
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids
encoding the amino acid sequence variants are preferably constructed by mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic
acid alterations can be made at sites that differ in the nucleic acids from different species
(variable positions) or in highly conserved regions (constant regions). Sites at such locations
will typically be modified in series, e.g., by substituting first with conservative choices (e.g.,
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions
may be made at the target site. Amino acid sequence deletions generally range from about 1 to
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid
residues. Intrasequence insertions may range generally from about 1 to 10 amino acid residues,
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal
sequences necessary for secretion or for intracellular targeting in different host cells and
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein.

In a preferred method, polynucleotides encoding the novel amino acid sequences are
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the
site being changed. In general, the techniques of site-directed mutagenesis are well known to
those of skill in the art and this technique is exemplified by publications such as Edelman et al.,
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids.
When small amounts of template DNA are used as starting material, primer(s) that differ
slightly in sequence from the corresponding region in the template DNA can generate the desired
amino acid variant. PCR amplification results in a population of product DNA fragments that
differ from the polynucleotide template encoding the polypeptide at the position specified by the
primer. The product DNA fragments replace the corresponding region in the plasmid and this
gives a polynucleotide encoding the desired amino acid variant.
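
The idea of a mutagenic oligonucleotide that carries the changed codon together with unchanged flanking nucleotides on both sides can be sketched in Python as follows. This is a hypothetical helper, not a primer-design procedure from the cited references: the function name, the in-frame coding-strand input, and the default flank of 12 nucleotides are all assumptions, and melting temperature and secondary structure are not considered.

```python
def mutagenic_oligo(template: str, codon_index: int, new_codon: str,
                    flank: int = 12) -> str:
    """Build an oligonucleotide carrying `new_codon` at a chosen codon.

    `template` is the coding strand read in frame from position 0, and
    `codon_index` is zero-based. Up to `flank` unchanged template
    nucleotides are kept on each side of the replaced codon so that the
    oligonucleotide can pair with the template on both sides of the change.
    """
    template = template.upper()
    new_codon = new_codon.upper()
    if len(new_codon) != 3:
        raise ValueError("new_codon must be three nucleotides long")
    if flank < 0:
        raise ValueError("flank must not be negative")
    start = 3 * codon_index
    if codon_index < 0 or start + 3 > len(template):
        raise ValueError("codon_index lies outside the template")
    left = template[max(0, start - flank):start]
    right = template[start + 3:start + 3 + flank]
    return left + new_codon + right
```

For instance, replacing the third codon (TTA, Leu) of ATGGCTTTAGGTCATAAA with GAA (Glu) and a flank of 6 gives ATGGCTGAAGGTCAT, in which only the central codon differs from the template.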

A further technique for generating amino acid variants is the cassette mutagenesis 
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well 
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current 
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic
code, other DNA sequences which encode substantially the same or a functionally equivalent
amino acid sequence may be used in the practice of the invention for the cloning and expression
of these novel nucleic acids. Such DNA sequences include those which are capable of
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions.

Polynucleotides encoding preferred polypeptide truncations of the invention can be used
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more 
domains of the invention and heterologous protein sequences. 

The polynucleotides of the invention additionally include the complement of any of the 
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or 
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known
to those of skill in the art and can include, for example, methods for determining hybridization 
conditions that can routinely isolate polynucleotides of the desired sequence identities. 

In accordance with the invention, polynucleotide sequences comprising the mature 
protein coding sequences corresponding to any one of SEQ ID NO:1-1786 and 3573-5358, or
functional equivalents thereof, may be used to generate recombinant DNA molecules that direct 
the expression of that nucleic acid, or a functional equivalent thereof, in appropriate host cells. 
Also included are the cDNA inserts of any of the clones identified herein. 

A polynucleotide according to the invention can be joined to any of a variety of other 
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al. 
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful 
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g., 
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the 
art. Accordingly, the invention also provides a vector including a polynucleotide of the 
invention and a host cell containing the polynucleotide. In general, the vector contains an origin 
of replication functional in at least one organism, convenient restriction endonuclease sites, and a 
selectable marker for the host cell. Vectors according to the invention include expression
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell 
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular 
organism or part of a multicellular organism. 

The present invention further provides recombinant constructs comprising a nucleic acid 
having any of the nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358 or a fragment
thereof or any other polynucleotides of the invention. In one embodiment, the recombinant
constructs of the present invention comprise a vector, such as a plasmid or viral vector, into
which a nucleic acid having any of the nucleotide sequences of SEQ ID NO:1-1786 and
3573-5358 or a fragment thereof is inserted, in a forward or reverse orientation. In the case of a vector
comprising one of the ORFs of the present invention, the vector may further comprise regulatory 
sequences, including for example, a promoter, operably linked to the ORF. Large numbers of 
suitable vectors and promoters are known to those of skill in the art and are commercially 
available for generating the recombinant constructs of the present invention. The following
vectors are provided by way of example. Bacterial: pBs, phagescript, PsiX174, pBluescript SK,
pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3,
pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44, PXT1, pSG (Stratagene);
pSVK3, pBPV, pMSG, pSVL (Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an expression
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many
suitable expression control sequences are known in the art. General methods of expressing
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in
Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that the isolated
polynucleotide of the invention and an expression control sequence are situated within a vector
or cell in such a way that the protein is expressed by a host cell which has been transformed
(transfected) with the ligated polynucleotide/expression control sequence.

Promoter regions can be selected from any desired gene using CAT (chloramphenicol
acetyltransferase) vectors or other vectors with selectable markers. Two appropriate vectors are
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of
the appropriate vector and promoter is well within the level of ordinary skill in the art.

Generally, recombinant expression vectors will include origins of replication and selectable
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct
transcription of a downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is
assembled in appropriate phase with translation initiation and termination sequences, and
preferably, a leader sequence capable of directing secretion of translated protein into the
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a structural DNA
sequence encoding a desired protein together with suitable translation initiation and termination
signals in operable reading phase with a functional promoter. The vector will comprise one or
more phenotypic selectable markers and an origin of replication to ensure maintenance of the
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be
employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use
can comprise a selectable marker and bacterial origin of replication derived from commercially
available plasmids comprising genetic elements of the well known cloning vector pBR322
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine
Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotech, Madison, WI, USA). These
pBR322 "backbone" sections are combined with an appropriate promoter and the structural
sequence to be expressed. Following transformation of a suitable host strain and growth of the
host strain to an appropriate cell density, the selected promoter is induced or derepressed by
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an
additional period. Cells are typically harvested by centrifugation, disrupted by physical or
chemical means, and the resulting crude extract retained for further purification.

Polynucleotides of the invention can also be used to induce immune responses. For
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or
following injection, and preferably intramuscular injection of the DNA. The nucleic acid
sequences are preferably inserted in a recombinant expression vector and may be in the form of
naked DNA.

4.3 ANTISENSE

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NO:1-1786 and 3573-5358, or fragments, analogs or derivatives thereof.
An "antisense" nucleic acid comprises a nucleotide sequence that is complementary to a "sense"
nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded
cDNA molecule or complementary to an mRNA sequence. In specific aspects, antisense nucleic
acid molecules are provided that comprise a sequence complementary to at least about 10, 25,
50, 100, 250 or 500 nucleotides or an entire coding strand, or to only a portion thereof. Nucleic
acid molecules encoding fragments, homologs, derivatives and analogs of a protein of any of
SEQ ID NO:1787-3572 and 5359-7144 or antisense nucleic acids complementary to a nucleic
acid sequence of SEQ ID NO:1-1786 and 3573-5358 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region"
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers 
to the region of the nucleotide sequence comprising codons which are translated into amino acid 
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a 
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term 
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not 
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions). 

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ ID
NO:1-1786 and 3573-5358), antisense nucleic acids of the invention can be designed according
to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid molecule 
can be complementary to the entire coding region of a mRNA, but more preferably is an 
oligonucleotide that is antisense to only a portion of the coding or noncoding region of a mRNA. 
For example, the antisense oligonucleotide can be complementary to the region surrounding the 
translation start site of a mRNA. An antisense oligonucleotide can be, for example, about 5, 10, 
15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention 
can be constructed using chemical synthesis or enzymatic ligation reactions using procedures 
known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can 
be chemically synthesized using naturally occurring nucleotides or variously modified 
nucleotides designed to increase the biological stability of the molecules or to increase the 
physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., 
phosphorothioate derivatives and acridine substituted nucleotides can be used. 
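
Designing an antisense oligonucleotide by Watson-Crick base pairing amounts to taking the reverse complement of a chosen window of the sense sequence. The Python sketch below is a minimal illustration under stated assumptions: the function name antisense_oligo is hypothetical, the output is unmodified DNA, and Hoogsteen pairing and the modified nucleotides mentioned in the text are not modeled.

```python
# Watson-Crick complements; U in an mRNA input pairs with A in the DNA oligo.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense_oligo(sense: str, start: int, length: int) -> str:
    """Return the DNA antisense oligonucleotide for a window of `sense`.

    `sense` is the coding strand (DNA) or mRNA, written 5'->3'. The result
    is the reverse complement of sense[start:start + length], also written
    5'->3'.
    """
    sense = sense.upper()
    if start < 0 or length <= 0 or start + length > len(sense):
        raise ValueError("window lies outside the sense sequence")
    window = sense[start:start + length]
    return window.translate(_COMPLEMENT)[::-1]
```

As a usage example, for the sense sequence GCCACCATGGCTAGC a 10-nucleotide window starting at position 3 (ACCATGGCTA, which spans the ATG start codon) yields the antisense oligonucleotide TAGCCATGGT.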

Examples of modified nucleotides that can be used to generate the antisense nucleic acid 
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine,
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest,
described further in the following subsection). 

The antisense nucleic acid molecules of the invention are typically administered to a 
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or 
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by 
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of 
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in 
the major groove of the double helix. An example of a route of administration of antisense
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively,
antisense nucleic acid molecules can be modified to target selected cells and then administered 
systemically. For example, for systemic administration, antisense molecules can be modified 
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., 
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using
the vectors described herein. To achieve sufficient intracellular concentrations of antisense 
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the 
control of a strong pol II or pol III promoter are preferred. 

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The
antisense nucleic acid molecule can also comprise a 2'-O-methylribonucleotide (Inoue et al.
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987)
FEBS Lett 215: 327-330).

4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a
single-stranded nucleic acid, such as an mRNA, to which they have a complementary region.
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haseloff and Gerlach (1988)
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit
translation of an mRNA. A ribozyme having specificity for a nucleic acid of the invention can be
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO:1-1786
and 3573-5358). For example, a derivative of a Tetrahymena L-19 IVS RNA can be
constructed in which the nucleotide sequence of the active site is complementary to the
nucleotide sequence to be cleaved in a SECX-encoding mRNA. See, e.g., Cech et al. U.S. Pat.
No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, SECX mRNA can be
used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA
molecules. See, e.g., Bartel et al. (1993) Science 261:1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical
structures that prevent transcription of the gene in target cells. See generally, Helene (1991)
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and
Maher (1992) BioEssays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the base 
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or 
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic 
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above;
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication.
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in
combination with other enzymes, e.g., S1 nucleases (Hyrup (1996) above); or as probes or
primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996),
above).

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their 
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the 
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug 
delivery known in the art. For example, PNA-DNA chimeras can be generated that may 
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition 
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24:
3357-63. For example, a DNA chain can be synthesized on a solid support using standard
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3'
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1995) Bioorg Med Chem
Lett 5: 1119-1124.

In other embodiments, the oligonucleotide may include other appended groups such as
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556;
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition,
oligonucleotides can be modified with hybridization triggered cleavage agents (see, e.g., Krol et
al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res.
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a
peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered
cleavage agent, etc.



4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain the
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the
invention introduced into the host cell using known transformation, transfection or infection
methods. The present invention still further provides host cells genetically engineered to express
the polynucleotides of the invention, wherein such polynucleotides are in operative association
with a regulatory sequence heterologous to the host cell which drives expression of the
polynucleotides in the cell.

Knowledge of nucleic acid sequences allows for modification of cells to permit, or 
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous 
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the
naturally occurring promoter with all or part of a heterologous promoter so that the cells express
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it
is operatively linked to the encoding sequences. See, for example, PCT International Publication 
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International 
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter 
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or 
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding 
sequence, amplification of the marker DNA by standard selection methods results in co- 
amplification of the desired protein coding sequences in the cells. 

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower 
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis,
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the
polynucleotides of the invention can be used in conventional manners to produce the gene
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a 
heterologous protein under the control of the EMF. 

Any host/vector system can be used to express one or more of the ORFs of the present 
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells,
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis.
The most preferred cells are those which do not normally express the particular polypeptide or
protein or which express the polypeptide or protein at a low natural level. Mature proteins can
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate 
promoters. Cell-free translation systems can also be employed to produce such proteins using 
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and 
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New
York (1989), the disclosure of which is hereby incorporated by reference. 

Various mammalian cell culture systems can also be employed to express recombinant 
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney 
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a 
compatible vector are, for example, the C127, monkey COS cells, Chinese Hamster Ovary
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3 
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived 
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK,
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation 
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking 
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example,
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced 
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or 
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein 
refolding steps can be used, as necessary, in completing configuration of the mature protein. 
Finally, high performance liquid chromatography (HPLC) can be employed for final purification
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient 
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing 
agents. 

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or 
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial 
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial 
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it
may be necessary to modify the protein produced therein, for example by phosphorylation or
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent
attachments may be accomplished using known chemical or enzymatic methods.

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene
may be replaced by homologous recombination. As described herein, gene targeting can be used 
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different 
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such 
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions,
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or
combinations of said sequences. Alternatively, sequences which affect the structure or stability 
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by 
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice
sites, leader sequences for enhancing or modifying transport or secretion properties of the
protein, or other sequences which alter or improve the function or stability of protein or RNA
molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the 
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or 
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element. 
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific 
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than 
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new 
sequences are added. In all cases, the identification of the targeting event may be facilitated by
the use of one or more selectable marker genes that are contiguous with the targeting DNA, 
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell 
genome. The identification of the targeting event may also be facilitated by the use of one or 
more marker genes exhibiting the property of negative selection, such that the negatively 
selectable marker is linked to the exogenous DNA, but configured such that the negatively
selectable marker flanks the targeting sequence, and such that a correct homologous 
recombination event with sequences in the host cell genome does not result in the stable 
integration of the negatively selectable marker. Markers useful for this purpose include the 
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine 
phosphoribosyl-transferase (gpt) gene. 

The gene targeting or gene activation techniques which can be used in accordance with 
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No. 
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference 
herein in its entirety. 

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising: the amino acid sequences set forth as any one of SEQ ID NO:1787-3572 and 5359-7144
or an amino acid sequence encoded by any one of the nucleotide sequences SEQ ID NO:1-1786
and 3573-5358 or the corresponding full length or mature protein. Polypeptides of the
invention also include polypeptides preferably with biological or immunological activity that are
encoded by: (a) a polynucleotide having any one of the nucleotide sequences set forth in SEQ ID
NO:1-1786 and 3573-5358 or (b) polynucleotides encoding any one of the amino acid sequences
set forth as SEQ ID NO:1787-3572 and 5359-7144 or (c) polynucleotides that hybridize to the
complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions.
The invention also provides biologically active or immunologically active variants of any of the
amino acid sequences set forth as SEQ ID NO:1787-3572 and 5359-7144 or the corresponding
full length or mature protein; and "substantial equivalents" thereof (e.g., with at least about
65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least
about 90%, typically at least about 95%, more typically at least about 98%, or most typically at
least about 99% amino acid identity) that retain biological activity. Polypeptides encoded by
allelic variants may have a similar, increased, or decreased activity compared to polypeptides
comprising SEQ ID NO:1787-3572 and 5359-7144.
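
The identity thresholds recited for "substantial equivalents" can be checked against a pair of pre-aligned amino acid sequences as sketched below. This is a minimal illustration, not a prescribed method: the function names, the example peptides, and the convention of dividing identical positions by the number of alignment columns (gap columns included) are all assumptions, and other denominators are in common use.

```python
# Illustrative sketch only: percent amino acid identity over a pairwise
# alignment, compared with the identity levels recited in the text.
IDENTITY_LEVELS = (65, 70, 75, 80, 85, 90, 95, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns; '-' marks a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" and y == "-":
            continue  # a column with gaps in both rows carries no information
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def highest_level_met(identity: float):
    """Return the highest recited identity level met, or None if below 65%."""
    met = [level for level in IDENTITY_LEVELS if identity >= level]
    return max(met) if met else None


# Hypothetical ten-residue peptides differing at the last position.
identity = percent_identity("MKTAYIAKQR", "MKTAYIAKQK")
print(identity, highest_level_met(identity))
```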

Fragments of the proteins of the present invention which are capable of exhibiting 
biological activity are also encompassed by the present invention. Fragments of the protein may 
be in linear form or they may be cyclized using known methods, for example, as described in H.
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer.
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such
fragments may be fused to carrier molecules such as immunoglobulins for many purposes, 
including increasing the valency of protein binding sites. 

The present invention also provides both full-length and mature forms (for example,
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding
sequence is identified in the sequence listing by translation of the disclosed nucleotide
sequences. The mature form of such protein may be obtained by expression of a full-length
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form
of the protein is also determinable from the amino acid sequence of the full-length form. Where
proteins of the present invention are membrane bound, soluble forms of the proteins are also
provided. In such forms, part or all of the regions causing the proteins to be membrane bound
are deleted so that the proteins are fully secreted from the cell in which they are expressed.

Protein compositions of the present invention may further comprise an acceptable carrier, 
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the nucleic acid
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a 
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to 
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic 
acid fragments of the present invention are the ORFs that encode proteins. 
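
The definition of a "degenerate variant" can be illustrated as follows. This is a sketch under stated assumptions: it uses the standard genetic code, the function names `translate` and `is_degenerate_variant` are hypothetical, and the short ORFs are invented for the example rather than taken from the sequence listing.

```python
# Illustrative sketch only: two ORFs are degenerate variants of one another
# when their nucleotide sequences differ but they encode the same polypeptide.
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate an ORF (DNA, reading frame 1) up to the first stop codon."""
    protein = []
    for i in range(0, len(orf) - len(orf) % 3, 3):
        aa = CODON_TABLE[orf[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True when the sequences differ but encode an identical polypeptide."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)


# Hypothetical ORFs: the second uses synonymous codons throughout.
reference = "ATGGCTTCTAAATAA"
variant = "ATGGCCAGCAAGTGA"
print(translate(reference), is_degenerate_variant(reference, variant))
```
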
A variety of methodologies known in the art can be utilized to obtain any one of the
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid
sequence can be synthesized using commercially available peptide synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary
structural and/or conformational characteristics with proteins, may possess biological properties
in common therewith, including protein activity. This technique is particularly useful in
producing small peptides and fragments of larger polypeptides. Fragments are useful, for
example, in generating antibodies against the native polypeptide. Thus, they may be employed
as biologically active or immunological substitutes for natural, purified proteins in screening of
therapeutic compounds and in immunological processes for the development of antibodies.

The polypeptides and proteins of the present invention can alternatively be purified from 
cells which have been altered to express the desired polypeptide or protein. As used herein, a 
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic 
manipulation, is made to produce a polypeptide or protein which it normally does not produce or
which the cell normally produces at a lower level. One skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic sequences into 
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides 
or proteins of the present invention. 

The invention also relates to methods for producing a polypeptide comprising growing a
culture of host cells of the invention in a suitable culture medium, and purifying the protein from
the cells or the culture in which the cells are grown. For example, the methods of the invention 
include a process for producing a polypeptide in which a host cell containing a suitable 
expression vector that includes a polynucleotide of the invention is cultured under conditions that 
allow expression of the encoded polypeptide. The polypeptide can be recovered from the
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and
further purified. Preferred embodiments include those in which the protein produced by such 
process is a full length or mature form of the protein. 

In an alternative method, the polypeptide or protein is purified from bacterial cells which 
naturally produce the polypeptide or protein. One skilled in the art can readily follow known
methods for isolating polypeptides and proteins in order to obtain one of the isolated
polypeptides or proteins of the present invention. These include, but are not limited to, 
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography, 
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that
retain biological/immunological activity include fragments comprising greater than about 100
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein 
domains. 

The purified polypeptides can be used in in vitro binding assays which are well known in
the art to identify molecules which bind to the polypeptides. These molecules include, but are not
limited to, e.g., small molecules, molecules from combinatorial libraries, antibodies or other
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the
molecules are titrated into a plurality of cell cultures or animals and then tested for either
cell/animal death or prolonged survival of the animal/cells.

In addition, the peptides of the invention or molecules capable of binding to the peptides 
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the 
specificity of the binding molecule for SEQ ID NO:1787-3572 and 5359-7144. 

The protein of the invention may also be expressed as a product of transgenic animals, 
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized 
by somatic or germ cells containing a nucleotide sequence encoding the protein. 

The proteins provided herein also include proteins characterized by amino acid sequences 
similar to those of purified proteins but into which modifications are naturally provided or
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be 
made by those skilled in the art using known techniques. Modifications of interest in the protein 
sequences may include the alteration, substitution, replacement, insertion or deletion of a 
selected amino acid residue in the coding sequence. For example, one or more of the cysteine 
residues may be deleted or replaced with another amino acid to alter the conformation of the 
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are 
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such
alteration, substitution, replacement, insertion or deletion retains the desired activity of the 
protein. Regions of the protein that are important for the protein function can be determined by 
various methods known in the art including the alanine-scanning method which involves
systematic substitution of single or strings of amino acids with alanine, followed by testing the 
resulting alanine-containing variant for biological activity. This type of analysis determines the 
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are 
important for protein function may be determined by the eMATRIX program. 
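
The systematic substitution underlying the alanine-scanning method can be sketched as below. The function name `alanine_scan`, the `window` parameter (1 for single residues, larger values for strings of residues), and the example peptide are assumptions for illustration; the sketch only enumerates variant sequences, and each variant would still need to be made and tested for biological activity as described above.

```python
# Illustrative sketch only: enumerate alanine-scanning variants of a protein.
def alanine_scan(protein: str, window: int = 1):
    """Return (position, original_segment, variant_sequence) tuples in which
    `window` consecutive residues are replaced with alanine. Positions are
    1-based. Segments that are already all alanine are skipped."""
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the protein length")
    variants = []
    for i in range(len(protein) - window + 1):
        segment = protein[i:i + window]
        if set(segment) == {"A"}:
            continue  # substituting alanine with alanine changes nothing
        variant = protein[:i] + "A" * window + protein[i + window:]
        variants.append((i + 1, segment, variant))
    return variants


# Hypothetical four-residue peptide.
for position, original, variant in alanine_scan("MKCA"):
    print(position, original, variant)
```

A loss of activity in a given variant would suggest that the substituted residue or residues contribute to function; retained activity would suggest the position tolerates substitution.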

Other fragments and derivatives of the sequences of proteins which would be expected to 
retain protein activity in whole or in part and are useful for screening or other immunological
methodologies may also be easily made by those skilled in the art given the disclosures herein.
Such modifications are encompassed by the present invention. 

The protein may also be produced by operably linking the isolated polynucleotide of the 
invention to suitable control sequences in one or more insect expression vectors, and employing 
an insect expression system. Materials and methods for baculovirus/insect cell expression 
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A. 
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by 
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present 
invention is "transformed." 

The protein of the invention may be prepared by culturing transformed host cells under 
culture conditions suitable to express the recombinant protein. The resulting expressed protein 
may then be purified from such culture (i.e., from culture medium or cell extracts) using known 
purification processes, such as gel filtration and ion exchange chromatography. The purification 
of the protein may also include an affinity column containing agents which will bind to the 
protein; one or more column steps over such affinity resins as concanavalin A-agarose, 
heparin-toyopearl™ or Cibacron blue 3GA Sepharose™; one or more steps involving
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl 
ether; or immunoaffinity chromatography. 

Alternatively, the protein of the invention may also be expressed in a form which will 
facilitate purification. For example, it may be expressed as a fusion protein, such as those of 
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as a 
His tag. Kits for expression and purification of such fusion proteins are commercially available 
from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen,
respectively. The protein can also be tagged with an epitope and subsequently purified by using 
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially 
available from Kodak (New Haven, Conn.). 

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC)
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other 
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing 
purification steps, in various combinations, can also be employed to provide a substantially 
homogeneous isolated recombinant protein. The protein thus purified is substantially free of 
other mammalian proteins and is defined in accordance with the present invention as an "isolated 
protein." 
The polypeptides of the invention include analogs (variants). This embraces fragments,
as well as peptides in which one or more amino acids has been deleted, inserted, or substituted. 
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or 
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to 
another moiety or moieties, e.g., targeting moiety or another therapeutic agent. Such analogs
may exhibit improved properties such as activity and/or stability. Examples of moieties which 
may be fused to the polypeptide or an analog include, for example, targeting moieties which 
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells, 
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well 
1 0 as receptor and ligands expressed on pancreatic or immune cells. Other moieties which may be 
fused to the polypeptide include therapeutic agents which are used for treatment, for example, 
immunosuppressive drugs such as cyclosporin, SK506, azathioprine, CD3 antibodies and 
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as 
alpha or beta interferon. 

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY 
AND SIMILARITY 

Preferred methods to determine identity and/or similarity are designed to give the largest match 
between the sequences tested. Methods to determine identity and similarity are codified in computer 
programs including, but not limited to, the GCG program package, including GAP 
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group, 
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F. 
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul, S.F. et al., Nucleic Acids Res. 
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp. 
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software (Nevill-Manning 
et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam software 
(Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by 
reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. Biol., 157, pp. 
105-31 (1982), incorporated herein by reference). The BLAST programs are publicly available 
from the National Center for Biotechnology Information (NCBI) and other sources (BLAST 
Manual, Altschul, S., et al., NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol. 
Biol. 215:403-410 (1990)). 
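
By way of illustration only, the notion of choosing the alignment that gives "the largest match" and then reporting identity can be sketched with a simple global (Needleman-Wunsch style) alignment. The sketch below is not the GAP, BLAST or FASTA implementation; the function names, the scoring values (match +1, mismatch -1, linear gap -1) and the choice of the aligned length as the denominator are assumptions made for the example, and real programs use substitution matrices and affine gap penalties.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Return one optimal global alignment of a and b (illustrative scoring)."""
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover the alignment.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    x, y = global_align(a, b)
    same = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 100.0 * same / len(x)


print(global_align("ACGT", "AGT"))
print(percent_identity("ACGT", "AGT"))
```

Other conventions for the denominator (for example, the length of the shorter sequence) give different percentages, so any reported identity value should state which convention and which program parameters were used.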

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric 
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to 
another polypeptide. Within a fusion protein the polypeptide according to the invention can 
correspond to all or a portion of a protein according to the invention. In one embodiment, a 
fusion protein comprises at least one biologically active portion of a protein according to the 
invention. In another embodiment, a fusion protein comprises at least two biologically active 
portions of a protein according to the invention. Within the fusion protein, the term "operatively 
linked" is intended to indicate that the polypeptide according to the invention and the other 
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or 
C-terminus. 

For example, in one embodiment a fusion protein comprises a polypeptide according to 
the invention operably linked to the extracellular domain of a second protein. 

In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide 
sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione 
S-transferase) sequences. 

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which 
the polypeptide sequences according to the invention, comprising one or more domains, are fused 
to sequences derived from a member of the immunoglobulin protein family. The 
immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical 
compositions and administered to a subject to inhibit an interaction between a ligand and a 
protein of the invention on the surface of a cell, to thereby suppress signal transduction in vivo. 
The immunoglobulin fusion proteins can be used to affect the bioavailability of a cognate ligand. 
Inhibition of the ligand/protein interaction may be useful therapeutically both for the treatment of 
proliferative and differentiative disorders, e.g., cancer, and for modulating (e.g., promoting or 
inhibiting) cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be 
used as immunogens to produce antibodies in a subject, to purify ligands, and in screening assays 
to identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand. 

A chimeric or fusion protein of the invention can be produced by standard recombinant 
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences 
are ligated together in-frame in accordance with conventional techniques, e.g., by employing 
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for 
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to 
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can 
be synthesized by conventional techniques including automated DNA synthesizers. 
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that 
give rise to complementary overhangs between two consecutive gene fragments that can 
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for 
example, Ausubel et al. (eds.) Current Protocols in Molecular Biology, John Wiley & 
Sons, 1992). Moreover, many expression vectors are commercially available that already encode 
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the 
invention can be cloned into such an expression vector such that the fusion moiety is linked 
in-frame to the protein of the invention. 
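
The in-frame requirement for a fusion gene can be checked computationally before any cloning is attempted. The following is a minimal sketch under stated assumptions: the function names, the optional linker argument and the short example sequences are hypothetical, the upstream coding sequence is assumed to start at its first codon, and only the standard stop codons TAA, TAG and TGA are considered.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a DNA string into consecutive three-base codons."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def fuse_in_frame(n_term_cds, c_term_cds, linker=""):
    """Join two coding sequences so the downstream one stays in frame.

    Raises ValueError if a segment would shift the reading frame or if
    the fused sequence contains a stop codon before its final codon.
    """
    parts = [n_term_cds.upper(), linker.upper(), c_term_cds.upper()]
    for part in parts:
        if len(part) % 3 != 0:
            raise ValueError("segment length is not a multiple of 3; "
                             "the fusion would shift the reading frame")
    # Remove the stop codon of the upstream partner so translation continues.
    if parts[0][-3:] in STOP_CODONS:
        parts[0] = parts[0][:-3]
    fused = "".join(parts)
    if any(c in STOP_CODONS for c in codons(fused)[:-1]):
        raise ValueError("internal stop codon in fused sequence")
    return fused


# Hypothetical fragments: an upstream tag CDS and a downstream CDS.
print(fuse_in_frame("ATGGCCTAA", "GGATCCTGA"))
```

A check of this kind addresses only the reading frame; it does not confirm that the restriction sites, overhangs or primers chosen for the actual ligation are compatible.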

4.8 GENE THERAPY 

Mutations in the polynucleotides of the invention may result in loss of normal 
function of the encoded protein. The invention thus provides gene therapy to restore normal 
activity of the polypeptides of the invention, or to treat disease states involving polypeptides of 
the invention. Delivery of a functional gene encoding polypeptides of the invention to 
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly 
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of 
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example, 
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of 
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific 
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of 
the nucleotides of the present invention or a gene encoding the polypeptides of the present 
invention can also be accomplished with extrachromosomal substrates (transient expression) or 
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 
Alternatively, it is contemplated that in other human disease states, preventing the expression of 
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease 
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively 
regulate the expression of polypeptides of the invention. 

Other methods of inhibiting expression of a protein include the introduction of antisense 
molecules to the nucleic acids of the present invention, their complements, or their translated RNA 
sequences, by methods known in the art. Further, the polypeptides of the present invention can be 
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such 
as a silencer, which is tissue specific. 

The present invention still further provides cells genetically engineered in vivo to express the 
polynucleotides of the invention, wherein such polynucleotides are in operative association with a 
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in 
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of 
the present invention. 

Knowledge of DNA sequences provided by the invention allows for modification of cells to 
permit, increase, or decrease, expression of endogenous polypeptide. Cells can be modified (e.g., by 
homologous recombination) to provide increased polypeptide expression by replacing, in whole or 
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells 
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is 
operatively linked to the desired protein encoding sequences. See, for example, PCT International 
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT 
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous 
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which 
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or 
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired 
protein coding sequence, amplification of the marker DNA by standard selection methods results in 
co-amplification of the desired protein coding sequences in the cells. 

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of 
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may 
be replaced by homologous recombination. As described herein, gene targeting can be used to 
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene 
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory 
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative 
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations 
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or 
protein produced may be replaced, removed, added, or otherwise modified by targeting. These 
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences 
for enhancing or modifying transport or secretion properties of the protein, or other sequences 
which alter or improve the function or stability of protein or RNA molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the gene 
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both 
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory 
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the 
targeting event may replace an existing element; for example, a tissue-specific enhancer can be 
replaced by an enhancer that has broader or different cell-type specificity than the naturally 
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are 
added. In all cases, the identification of the targeting event may be facilitated by the use of one or 
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection 
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the 
targeting event may also be facilitated by the use of one or more marker genes exhibiting the 
property of negative selection, such that the negatively selectable marker is linked to the exogenous 
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and 
such that a correct homologous recombination event with sequences in the host cell genome does 
not result in the stable integration of the negatively selectable marker. Markers useful for this 
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial 
xanthine-guanine phosphoribosyl-transferase (gpt) gene. 

The gene targeting or gene activation techniques which can be used in accordance with this 
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel; 
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627 
(WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436 
(WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety. 

4.9 TRANSGENIC ANIMALS 

In preferred methods to determine biological functions of the polypeptides of the 
invention in vivo, one or more genes provided by the invention are either overexpressed or 
inactivated in the germ line of animals using homologous recombination [Capecchi, Science 
244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory 
control of exogenous or endogenous promoter elements, are known as transgenic animals. 
Animals in which an endogenous gene has been inactivated by homologous recombination are 
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be 
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic 
animals are useful to determine the roles polypeptides of the invention play in biological 
processes, and preferably in disease states. Transgenic animals are useful as model systems to 
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human 
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT 
Publication No. WO 94/28122, incorporated herein by reference. 

Transgenic animals can be prepared wherein all or part of a promoter of the 
polynucleotides of the invention is either activated or inactivated to alter the level of expression 
of the polypeptides of the invention. Inactivation can be carried out using homologous 
recombination methods described above. Activation can be achieved by supplementing or even 
replacing the homologous promoter to provide for increased protein expression. The homologous 
promoter can be supplemented by insertion of one or more heterologous enhancer elements 
known to confer promoter activation in a particular tissue. 

The polynucleotides of the present invention also make possible the development, 
through, e.g., homologous recombination or knock out strategies, of animals that fail to express 
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as 
models for studying the in vivo activities of the polypeptides as well as for studying modulators of the 
polypeptides of the invention. 

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one or 
more of the uses or biological activities (including those associated with assays cited herein) 
identified herein. Uses or activities described for proteins of the present invention may be 
provided by administration or use of such proteins or of polynucleotides encoding such proteins 
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The 
mechanism underlying the particular condition or pathology will dictate whether the 
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or 
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic 
compositions of the invention" include compositions comprising isolated polynucleotides 
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or 
polypeptides of the invention (including full length protein, mature protein and truncations or 
domains thereof), or compounds and other substances that modulate the overall activity of the 
target gene products, either at the level of target gene/protein expression or target protein 
activity. Such modulators include polypeptides, analogs (variants), including fragments and 
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or 
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening 
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple 
helix formation; and in particular antibodies or other binding partners that specifically recognize 
one or more epitopes of the polypeptides of the invention. 

The polypeptides of the present invention may likewise be involved in cellular activation 
or in one of the other physiological pathways described herein. 

4.10.1 RESEARCH USES AND UTILITIES 

The polynucleotides provided by the present invention can be used by the research 
community for various purposes. The polynucleotides can be used to express recombinant 
protein for analysis, characterization or therapeutic use; as markers for tissues in which the 
corresponding protein is preferentially expressed (either constitutively or at a particular stage of 
tissue differentiation or development or in disease states); as molecular weight markers on gels; 
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene 
positions; to compare with endogenous DNA sequences in patients to identify potential genetic 
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of 
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known 
sequences in the process of discovering other novel polynucleotides; for selecting and making 
oligomers for attachment to a "gene chip" or other support, including for examination of 
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as 
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the 
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for 
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap 
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify 
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of 
the binding interaction. 

The polypeptides provided by the present invention can similarly be used in assays to 
determine biological activity, including in a panel of multiple proteins for high-throughput 
screening; to raise antibodies or to elicit another immune response; as a reagent (including the 
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its 
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is 
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or 
development or in a disease state); and, of course, to isolate correlative receptors or ligands. 
Proteins involved in these binding interactions can also be used to screen for peptide or small 
molecule inhibitors or agonists of the binding interaction. 

Any or all of these research utilities are capable of being developed into reagent grade or 
kit format for commercialization as research products. 

Methods for performing the uses listed above are well known to those skilled in the art. 
References disclosing such methods include without limitation "Molecular Cloning: A 
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch 
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning 
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987. 

4.10.2 NUTRITIONAL USES 

Polynucleotides and polypeptides of the present invention can also be used as nutritional 
sources or supplements. Such uses include without limitation use as a protein or amino acid 
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In 
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a 
particular organism or can be administered as a separate solid or liquid preparation, such as in the 
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the 
polypeptide or polynucleotide of the invention can be added to the medium in or on which the 
microorganism is cultured. 

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION 
ACTIVITY 

A polypeptide of the present invention may exhibit activity relating to cytokine, cell 
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting) 
activity or may induce production of other cytokines in certain cell populations. A 
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many 
protein factors discovered to date, including all known cytokines, have exhibited activity in one 
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient 
confirmation of cytokine activity. The activity of therapeutic compositions of the present 
invention is evidenced by any one of a number of routine factor-dependent cell proliferation 
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3, 
MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK, 
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following: 

Assays for T-cell or thymocyte proliferation include without limitation those described 
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. 
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, 
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in 
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol. 
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli 
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994. 

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or 
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation, 
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan 
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse 
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan 
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994. 

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells 
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2 
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in 
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991; 
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988; 
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse 
and human interleukin 6-Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol 
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci. 
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11-Bennett, F., Giannotti, J., 
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 
6.15.1 John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin 
9-Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. 
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991. 

Assays for T-cell clone responses to antigens (which will identify, among others, proteins 
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and 
cytokine production) include, without limitation, those described in: Current Protocols in 
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, 
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse 
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7, 
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095, 
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol. 
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988. 

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor activity and 
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem 
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or 
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or 
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential 
state which would be useful for re-engineering damaged or diseased tissues, transplantation, 
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce 
large quantities of human cells has important working applications for the production of human 
proteins which currently must be obtained from non-human sources or donors; implantation of 
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases; 
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including 
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs 
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung. 

It is contemplated that multiple different exogenous growth factors and/or cytokines may 
be administered in combination with the polypeptide of the invention to achieve the desired 
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and 
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand 
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage 
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet 
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic fibroblast 
growth factor (bFGF). 

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of 
these cells in culture will facilitate the production of large quantities of mature cells. Techniques 
for culturing stem cells are known in the art and administration of polypeptides of the invention, 
optionally with other growth factors and/or cytokines, is expected to enhance the survival and 
proliferation of the stem cell populations. This can be accomplished by direct administration of 
the polypeptide of the invention to the culture medium. Alternatively, stromal cells transfected 
with a polynucleotide that encodes the polypeptide of the invention can be used as a feeder 
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers 
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or 
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926). 

Stem cells themselves can be transfected with a polynucleotide of the invention to induce 
autocrine expression of the polypeptide of the invention. This will allow for generation of 
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be 
differentiated into the desired mature cell types. These stable cell lines can also serve as a source 
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for 
polymerase chain reaction experiments. These studies would allow for the isolation and 
identification of differentially expressed genes in stem cell populations that regulate stem cell 
proliferation and/or maintenance. 

Expansion and maintenance of totipotent stem cell populations will be useful in the 
treatment of many pathological conditions. For example, polypeptides of the present invention 
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be 
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or 
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation 
of neural cells and for the regeneration of nerve and brain tissue, i.e. for the treatment of central 
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic 
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition, 
the expanded stem cell populations can also be genetically altered for gene therapy purposes and 
to decrease host rejection of replacement tissues after grafting or implantation. 

Expression of the polypeptide of the invention and its effect on stem cells can also be 
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell 
types. A broadly applicable method of obtaining pure populations of a specific differentiated 
cell type from undifferentiated stem cell populations involves the use of a cell-type specific 
promoter driving a selectable marker. The selectable marker allows only cells of the desired type 
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus 
et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998)) 
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al., 
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be 
accomplished by culturing the stem cells in the presence of a differentiation factor such as 
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the 
effects of endogenous stem cell factor activity and allow differentiation to proceed. 

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention 
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell 
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder 
layer, as described by Thompson et al. Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in 
the presence of the polypeptide of the invention alone or in combination with other growth 
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell 
proliferation is determined by colony formation on semi-solid support, e.g. as described by 
Bernstein et al. Blood, 77: 2316-2321 (1991). 



4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of hematopoiesis 
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal 
biological activity in support of colony forming cells or of factor-dependent cell lines indicates 
involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of 
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility, 
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy 
to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the 
growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e., 
traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or 
treat consequent myelo-suppression; in supporting the growth and proliferation of 
megakaryocytes and consequently of platelets thereby allowing prevention or treatment of 
various platelet disorders such as thrombocytopenia, and generally for use in place of or 
complementary to platelet transfusions; and/or in supporting the growth and proliferation of 
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned 
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as 
those usually treated with transplantation, including, without limitation, aplastic anemia and 
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment 
post irradiation/chemotherapy, either in vivo or ex vivo (i.e., in conjunction with bone marrow 
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous)) 
as normal cells or genetically manipulated for gene therapy. 

Therapeutic compositions of the invention can be used in the following: 

Suitable assays for proliferation and differentiation of various hematopoietic lines are 
cited above. 

Assays for embryonic stem cell differentiation (which will identify, among others, 
proteins that influence embryonic differentiation hematopoiesis) include, without limitation, 
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., Molecular 
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993. 

Assays for stem cell survival and differentiation (which will identify, among others, 
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in: 
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I. 
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al., 
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells 
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic 
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et 
al. Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay, 
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21, 
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of 
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I. 
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture 
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al. 
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994. 

4.10.6 TISSUE GROWTH ACTIVITY 

A polypeptide of the present invention also may be involved in bone, cartilage, tendon, 
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue 
repair and replacement, and in healing of burns, incisions and ulcers. 

A polypeptide of the present invention which induces cartilage and/or bone growth in 
circumstances where bone is not normally formed, has application in the healing of bone 
fractures and cartilage damage or defects in humans and other animals. Compositions of a 
polypeptide, antibody, binding partner, or other modulator of the invention may have 
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of 
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair 
of congenital, trauma induced, or oncologic resection induced craniofacial defects, and also is 
useful in cosmetic plastic surgery. 

A polypeptide of this invention may also be involved in attracting bone-forming cells, 
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of 
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or 
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking 
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.) 
mediated by inflammatory processes may also be possible using the composition of the 
invention. 

Another category of tissue regeneration activity that may involve the polypeptide of the 
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or 
other tissue formation in circumstances where such tissue is not normally formed, has application 
in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in 
humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing 
protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as 
use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing 
defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by 
a composition of the present invention contributes to the repair of congenital, trauma induced, or 
other tendon or ligament defects of other origin, and is also useful in cosmetic plastic surgery for 
attachment or repair of tendons or ligaments. The compositions of the present invention may 
provide an environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or 
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming 
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect 
tissue repair. The compositions of the invention may also be useful in the treatment of tendinitis, 
carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include 
an appropriate matrix and/or sequestering agent as a carrier as is well known in the art. 

The compositions of the present invention may also be useful for proliferation of neural 
cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central and peripheral 
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which 
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a 
composition may be used in the treatment of diseases of the peripheral nervous system, such as 
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous 
system diseases, such as Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic 
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in 
accordance with the present invention include mechanical and traumatic disorders, such as spinal 
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies 
resulting from chemotherapy or other medical therapies may also be treatable using a 
composition of the invention. 

Compositions of the invention may also be useful to promote better or faster closure of 
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular 
insufficiency, surgical and traumatic wounds, and the like. 

Compositions of the present invention may also be involved in the generation or 
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine, 
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular 
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the 
desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue 
to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity. 

A composition of the present invention may also be useful for gut protection or 
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and 
conditions resulting from systemic cytokine damage. 

A composition of the present invention may also be useful for promoting or inhibiting 
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the 
growth of tissues described above. 

Therapeutic compositions of the invention can be used in the following: 

Assays for tissue generation activity include, without limitation, those described in: 
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent 
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No. 
WO91/07491 (skin, endothelium). 

Assays for wound healing activity include, without limitation, those described in: Winter, 
Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book 
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol. 
71:382-84 (1978). 

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY 

A polypeptide of the present invention may also exhibit immune stimulating or immune 
suppressing activity, including without limitation the activities for which assays are described 
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A 
protein may be useful in the treatment of various immune deficiencies and disorders (including 
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and 
proliferation of T and/or B lymphocytes, as well as affecting the cytolytic activity of NK cells 
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g., 
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More 
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be 
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses, 
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such 
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful 
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer. 
Autoimmune disorders which may be treated using a protein of the present invention 
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus, 
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome, 
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host 
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof, 
including antibodies) of the present invention may also be useful in the treatment of allergic 
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect 
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria, 
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme, 
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal 
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma 
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune 
suppression is desired (including, for example, organ transplantation), may also be treatable 
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the 
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal 
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66, 
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization 
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al., 
J. Toxicol. Environ. Health 53: 563-79). 

Using the proteins of the invention it may also be possible to modulate immune 
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an 
immune response already in progress or may involve preventing the induction of an immune 
response. The functions of activated T cells may be inhibited by suppressing T cell responses or 
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is 
generally an active, non-antigen-specific, process which requires continuous exposure of the T 
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy 
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and 
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be 
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence 
of the tolerizing agent. 

Down regulating or preventing one or more antigen functions (including without 
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high 
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and 
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell 
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue 
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells, 
followed by an immune reaction that destroys the transplant. The administration of a therapeutic 
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells, 
and thus acts as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient 
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance 
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration 
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it 
may also be necessary to block the function of a combination of B lymphocyte antigens. 

The efficacy of particular therapeutic compositions in preventing organ transplant 
rejection or GVHD can be assessed using animal models that are predictive of efficacy in 
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in 
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine 
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et 
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89:11102-11105 
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven 
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic 
compositions of the invention on the development of that disease. 

Blocking antigen function may also be therapeutically useful for treating autoimmune 
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are 
reactive against self tissue and which promote the production of cytokines and autoantibodies 
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may 
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T 
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T 
cell-derived cytokines which may be involved in the disease process. Additionally, blocking 
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to 
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating 
autoimmune disorders can be determined using a number of well-characterized animal models of 
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis, 
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune 
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental 
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp. 
840-856). 

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means 
of upregulating immune responses, may also be useful in therapy. Upregulation of immune 
responses may be in the form of enhancing an existing immune response or eliciting an initial 
immune response. For example, enhancing an immune response may be useful in cases of viral 
infection, including systemic viral diseases such as influenza, the common cold, and encephalitis. 

Alternatively, anti-viral immune responses may be enhanced in an infected patient by 
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed 
APCs either expressing a peptide of the present invention or together with a stimulatory form of 
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the 
patient. Another method of enhancing anti-viral immune responses would be to isolate infected 
cells from a patient, transfect them with a nucleic acid encoding a protein of the present 
invention as described herein such that the cells express all or a portion of the protein on their 
surface, and reintroduce the transfected cells into the patient. The infected cells would now be 
capable of delivering a costimulatory signal to, and thereby activating, T cells in vivo. 

A polypeptide of the present invention may provide the necessary stimulation signal to T 
cells to induce a T cell mediated immune response against the transfected tumor cells. In 
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to 
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with 
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an 
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain 
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II 
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction 
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T 
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding 
an antisense construct which blocks expression of an MHC class II associated protein, such as 
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity 
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce 
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human 
subject may be sufficient to overcome tumor-specific tolerance in the subject. 

The activity of a protein of the invention may, among other means, be measured by the 
following methods: 

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation, 
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. 
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and 
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; 
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA 
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J. 
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. 
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al., 
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994. 



Assays for T-cell-dependent immunoglobulin responses and isotype switching (which 
will identify, among others, proteins that modulate T-cell dependent antibody responses and that 
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J. 
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production, 
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. Coligan et al. eds. Vol 1 
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994. 

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins 
that generate predominantly Th1 and CTL responses) include, without limitation, those described 
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. 
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, 
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in 
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992. 

Dendritic cell-dependent assays (which will identify, among others, proteins expressed by 
dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery 
et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine 
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et 
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology 
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of 
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation 
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990. 

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins 
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte 
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry 
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research 
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology 
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International 
Journal of Oncology 1:639-648, 1992. 

Assays for proteins that influence early steps of T-cell commitment and development 
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al., 
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al., 
Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991. 

4.10.8 ACTIVIN/INHIBIN ACTIVITY 

A polypeptide of the present invention may also exhibit activin- or inhibin-related 
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such 
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle 
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the 
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention, 
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive 
based on the ability of inhibins to decrease fertility in female mammals and decrease 
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can 
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a 
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as 
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH 
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A 
polypeptide of the invention may also be useful for advancement of the onset of fertility in 
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic 
animals such as, but not limited to, cows, sheep and pigs. 

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods. 

Assays for activin/inhibin activity include, without limitation, those described in: Vale et 
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature 
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci. 
USA 83:3091-3095, 1986. 

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY 

A polypeptide of the present invention may be involved in chemotactic or chemokinetic 
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils, 
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the 
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic 
receptor activation can be used to mobilize or attract a desired cell population to a desired site of 
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or 
modulators of the invention) provide particular advantages in treatment of wounds and other 
trauma to tissues, as well as in treatment of localized infections. For example, attraction of 
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved 
immune responses against the tumor or infecting agent. 

A protein or peptide has chemotactic activity for a particular cell population if it can 
stimulate, directly or indirectly, the directed orientation or movement of such cell population. 
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells. 
Whether a particular protein has chemotactic activity for a population of cells can be readily 
determined by employing such protein or peptide in any known assay for cell chemotaxis. 

Therapeutic compositions of the invention can be used in the following: 

Assays for chemotactic activity (which will identify proteins that induce or prevent 
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells 
across a membrane as well as the ability of a protein to induce the adhesion of one cell 
population to another cell population. Suitable assays for movement and adhesion include, 
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. 
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates 
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines 
6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et al. APMIS 103:140-146, 
1995; Muller et al. Eur. J. Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 152:5860-5867, 
1994; Johnston et al. J. of Immunol. 153:1762-1768, 1994. 
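
By way of illustration only, the readout of a membrane migration assay of the kind referred to above might be summarized as a ratio of migrated cell counts. The following is a minimal sketch and is not taken from the cited protocols: the function name chemotactic_index, the use of replicate wells, and the choice of a buffer-only control are all assumptions made for this example.

```python
from statistics import mean


def chemotactic_index(treated_counts, control_counts):
    """Return mean migrated cells in test wells divided by mean in control wells.

    Hypothetical helper: both arguments are lists of migrated-cell counts
    from replicate wells (test protein versus buffer-only control).
    """
    if not treated_counts or not control_counts:
        raise ValueError("both count lists must be non-empty")
    control_mean = mean(control_counts)
    if control_mean <= 0:
        # A ratio is undefined without measurable background migration.
        raise ValueError("control wells must show some background migration")
    return mean(treated_counts) / control_mean


if __name__ == "__main__":
    # Illustrative numbers only, not experimental data.
    treated = [120, 135, 128]
    control = [40, 44, 36]
    print(f"chemotactic index: {chemotactic_index(treated, control):.2f}")
```

Under these assumptions, a ratio well above 1 would be consistent with induced migration and a ratio near 1 with no effect; any threshold for calling a protein chemotactic would have to come from the assay protocol actually used.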


4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY 

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or 
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such 
attributes. Compositions may be useful in treatment of various coagulation disorders (including 
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events 
in treating wounds resulting from trauma, surgery or other causes. A composition of the 
invention may also be useful for dissolving or inhibiting formation of thromboses and for 
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of 
cardiac and central nervous system vessels (e.g., stroke)). 

Therapeutic compositions of the invention can be used in the following: 

Assays for hemostatic and thrombolytic activity include, without limitation, those 
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res. 
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins 
35:467-474, 1988. 


4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation or 
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the 
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For 
example, the presence or increased expression of a polynucleotide/polypeptide of the invention 
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy. 
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer 
condition. Identification of single nucleotide polymorphisms associated with cancer or a 
predisposition to cancer may also be useful for diagnosis or prognosis. 

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation, 
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth) 
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic 
compositions of the invention may be effective in adult and pediatric oncology including in solid 
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic 
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma, 
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer, 
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell 
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal 
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps 
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including 
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian 
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle, 
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors, 
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central 
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma, 
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma, 
hemangiopericytoma and Kaposi's sarcoma.

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including 
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be 
administered to treat cancer. Therapeutic compositions can be administered in therapeutically 
effective dosages alone or in combination with adjuvant cancer therapy such as surgery, 
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial 
effect, e.g. reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise 
improving overall clinical condition, without necessarily eradicating the cancer. 

The composition can also be administered in therapeutically effective amounts as a 
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or 
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine.
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide,
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin
(cis-DDP), Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin,
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (VP-16-213),
Floxuridine, 5-Fluorouracil (5-FU), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide,
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog),
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna,
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl,
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate,
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin,
Semustine, Teniposide, and Vindesine sulfate.

In addition, therapeutic compositions of the invention may be used for prophylactic 
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g. 
exposure to carcinogens) known in the art that predispose an individual to developing cancers. 
Under these circumstances, it may be beneficial to treat these individuals with therapeutically
effective doses of the polypeptide of the invention to reduce the risk of developing cancers.

In vitro models can be used to determine the effective doses of the polypeptide of the
invention as a potential cancer treatment. These in vitro models include proliferation assays of
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21),
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52: 921-30
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in
Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis assays such as induction
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial
cell migration as described in Ribatti et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al.,
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available,
e.g. from American Type Culture Collection catalogs.

4.10.12 RECEPTOR/LIGAND ACTIVITY

A polypeptide of the present invention may also demonstrate activity as a receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions
and their ligands (including without limitation, cellular adhesion molecules (such as selectins,
integrins and their ligands) and receptor/ligand pairs involved in antigen presentation, antigen
recognition and development of cellular and humoral immune responses). Receptors and ligands
are also useful for screening of potential peptide or small molecule inhibitors of the relevant
receptor/ligand interaction. A protein of the present invention (including, without limitation,
fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand
interactions.

The activity of a polypeptide of the invention may, among other means, be measured by
the following methods:

Suitable assays for receptor-ligand activity include without limitation those described in:
Current Protocols in Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M.
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28,
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22), Takai et al., Proc.
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988;
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor for a
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel
overlay assays, or other methods known in the art.

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes,
colorimetric molecules or toxin molecules by conventional methods. ("Guide to Protein
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of
toxins include, but are not limited to, ricin.

4.10.13 DRUG SCREENING 

This invention is particularly useful for screening chemical compounds by using the
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques.
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a
solid support, borne on a cell surface or located intracellularly. One method of drug screening
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can
be used for standard binding assays. One may measure, for example, the formation of complexes
between polypeptides of the invention or fragments and the agent being tested or examine the
diminution in complex formation between the novel polypeptides and an appropriate cell line,
which are well known in the art.
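
As a rough illustration of how a competitive binding readout might be reduced to a single number, the following Python sketch computes the percent inhibition of complex formation caused by a test compound. The function name, the argument names and the use of a nonspecific-binding control are assumptions made for illustration only; they are not taken from this specification.

```python
def percent_inhibition(total_bound, bound_with_test, nonspecific=0.0):
    """Percent reduction in specific complex formation caused by a test agent.

    total_bound     -- signal with labeled ligand only (no competitor)
    bound_with_test -- signal with labeled ligand plus the test compound
    nonspecific     -- background signal (hypothetical control measurement)
    """
    specific_total = total_bound - nonspecific
    if specific_total <= 0:
        raise ValueError("total binding must exceed nonspecific binding")
    specific_test = bound_with_test - nonspecific
    # 0% means no displacement; 100% means complete displacement.
    return 100.0 * (1.0 - specific_test / specific_total)
```

A compound that leaves the signal unchanged scores 0, and one that reduces the signal to background scores 100; the cutoff for calling a compound a "hit" is left to the assay designer.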

Sources for test compounds that may be screened for ability to bind to or modulate (i.e., 
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and 
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of either random or mimetic peptides, oligonucleotides or organic molecules.

Chemical libraries may be readily synthesized or purchased from a number of
commercial sources, and may include structural analogs of known compounds or compounds
that are identified as "hits" or "leads" via natural product screening.

The sources of natural product libraries are microorganisms (including bacteria and 
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for 
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include 
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a 
review, see Science 252:63-68 (1998). 

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or
organic compounds and can be readily prepared by traditional automated synthesis methods,
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein,
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries.
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin.
Biotechnol. 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see
Al-Obeidi et al., Mol. Biotechnol. 9(3):205-23 (1998); Hruby et al., Curr. Opin. Chem. Biol.,
1(1):114-19 (1997); Dorner et al., Bioorg. Med. Chem., 4(5):709-15 (1996) (alkylated dipeptides).

Identification of modulators through use of the various libraries described herein permits
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a
polypeptide of the invention. The molecules identified in the binding assay are then tested for
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested
for either cell/animal death or prolonged survival of the animal/cells.

The binding molecules thus identified may be complexed with toxins, e.g., ricin or
cholera toxin, or with other compounds that are toxic to cells such as radioisotopes. The
toxin-binding molecule complex is then targeted to a tumor or other cell by the specificity of the binding
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be
complexed with imaging agents for targeting and imaging purposes.

4.10.14 ASSAY FOR RECEPTOR ACTIVITY

The invention also provides methods to detect specific binding of a polypeptide, e.g., a
ligand or a receptor. The art provides numerous assays particularly useful for identifying
previously unknown binding partners for receptor polypeptides of the invention. For example,
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used
to identify polynucleotides encoding binding partners. As another example, affinity
chromatography with the appropriate immobilized polypeptide of the invention can be used to
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number
of different libraries used for the identification of compounds, and in particular small molecules,
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention.
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous
ligands, or cocktails of ligands, to two cell populations that are genetically identical except for
the expression of the receptor of the invention: one cell population expresses the receptor of the
invention whereas the other does not. The responses of the two cell populations to the addition of
ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the
polypeptide of the invention in cells and assayed for an autocrine response to identify potential
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known
in the art can be used to identify binding partner polypeptides, including, (1) organic and
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of random peptides, oligonucleotides or organic molecules.
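
The two-population comparison described above can be sketched in Python as follows. This is only a minimal illustration: the function name, the dictionary layout, the response units and the two-fold threshold are all hypothetical choices, not values given in this specification.

```python
def candidate_ligands(receptor_cells, control_cells, min_fold=2.0):
    """Return ligands whose response is at least min_fold higher in
    receptor-expressing cells than in otherwise identical control cells.

    Both arguments map a ligand name to a measured response (arbitrary units).
    """
    hits = []
    for ligand, response in receptor_cells.items():
        baseline = control_cells.get(ligand)
        if baseline is None or baseline <= 0:
            continue  # no usable control measurement for this ligand
        if response / baseline >= min_fold:
            hits.append(ligand)
    return sorted(hits)
```

Ligands without a matching control measurement are skipped, since the method depends on comparing the same stimulus across both populations.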

The role of downstream intracellular signaling molecules in the signaling cascade of the
polypeptide of the invention can be determined. For example, a chimeric protein in which the
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then
be assayed for expected modifications, e.g., phosphorylation. Other methods known to those in the
art can also be used to identify signaling molecules involved in receptor activity.

4.10.15 ANTI-INFLAMMATORY ACTIVITY

Compositions of the present invention may also exhibit anti-inflammatory activity. The
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example,
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production
of other factors which more directly inhibit or promote an inflammatory response. Compositions
with such activities can be used to treat inflammatory conditions (including chronic or acute
conditions), including without limitation inflammation associated with infection (such as septic
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury,
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or resulting from
over production of cytokines such as TNF or IL-1. Compositions of the invention may also be
useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material.
Compositions of this invention may be utilized to prevent or treat conditions such as, but not
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1,
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary
disease, other autoimmune disease or inflammatory disease, or as an antiproliferative agent such as for
acute or chronic myelogenous leukemia, or in the prevention of premature labor secondary to
intrauterine infections.

4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the
invention. Such leukemias and related disorders include but are not limited to acute leukemia,
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic,
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia).

4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of
intervention with compounds that modulate the activity of the polynucleotides and/or
polypeptides of the invention, and which can be treated upon thus observing an indication of
therapeutic utility, include but are not limited to nervous system injuries, and diseases or
disorders which result in either a disconnection of axons, a diminution or degeneration of
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including
human and non-human mammalian patients) according to the invention include but are not
limited to the following lesions of either the central (including spinal cord, brain) or peripheral
nervous systems:

(i) traumatic lesions, including lesions caused by physical injury or associated with
surgery, for example, lesions which sever a portion of the nervous system, or compression
injuries;

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord
infarction or ischemia;

(iii) infectious lesions, in which a portion of the nervous system is destroyed or injured
as a result of infection, for example, by an abscess or associated with infection by human
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease,
tuberculosis, syphilis;

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or
injured as a result of a degenerative process including but not limited to degeneration associated
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral
sclerosis;

(v) lesions associated with nutritional diseases or disorders, in which a portion of the
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease,
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus
callosum), and alcoholic cerebellar degeneration;

(vi) neurological lesions associated with systemic diseases including but not limited to
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or
sarcoidosis;

(vii) lesions caused by toxic substances including alcohol, lead, or particular
neurotoxins; and

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or
injured by a demyelinating disease including but not limited to multiple sclerosis, human
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies,
progressive multifocal leukoencephalopathy, and central pontine myelinolysis.

Therapeutics which are useful according to the invention for treatment of a nervous
system disorder may be selected by testing for biological activity in promoting the survival or
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit
any of the following effects may be useful according to the invention:

(i) increased survival time of neurons in culture;

(ii) increased sprouting of neurons in culture or in vivo;

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g.,
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the method set
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al.
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc.,
depending on the molecule to be measured; and motor neuron dysfunction may be measured by
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron
conduction velocity, or functional disability.

In specific embodiments, motor neuron disorders that may be treated according to the
invention include but are not limited to disorders such as infarction, infection, exposure to toxin,
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as
well as other components of the nervous system, as well as disorders that selectively affect
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome),
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy
(Charcot-Marie-Tooth Disease).



4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following additional
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents,
including, without limitation, bacteria, viruses, fungi and other parasites; affecting (suppressing
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape
(such as, for example, breast augmentation or diminution, change in bone form or shape);
affecting biorhythms or circadian cycles or rhythms; affecting the fertility of male or female
subjects; affecting the metabolism, catabolism, anabolism, processing, utilization, storage or
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other
nutritional factors or component(s); affecting behavioral characteristics, including, without
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting
deficiencies of the enzyme and treating deficiency-related diseases; treatment of
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen
in a vaccine composition to raise an immune response against such protein or another material or
entity which is cross-reactive with such protein.

4.10.19 IDENTIFICATION OF POLYMORPHISMS 

The demonstration of polymorphisms makes possible the identification of such
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or
susceptibility to various disease states (such as disorders involving inflammation or immune
response) or a differential response to drug administration, and this genetic information can be
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a
polymorphism associated with a predisposition to inflammation or autoimmune disease makes
possible the diagnosis of this condition in humans by identifying the presence of the
polymorphism.

Polymorphisms can be identified in a variety of ways known in the art which all
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally
involving isolation or amplification of the DNA, and identifying the presence of the
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides).
In addition, traditional restriction fragment length polymorphism analysis (using restriction
enzymes that provide differential digestion of the genomic DNA depending on the presence or
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the
present invention can be used to detect polymorphisms. The array can comprise modified
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the
present invention. In the alternative, any one of the nucleotide sequences of the present
invention can be placed on the array to detect changes from those sequences.
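
Once an amplified fragment has been sequenced, the comparison against a reference sequence is straightforward to sketch in code. The following Python example is a minimal sketch that assumes the two sequences are already aligned and of equal length; the function names are hypothetical. The second function mimics the logic behind restriction fragment length polymorphism analysis by checking whether a recognition site (here GAATTC, the EcoRI site, used purely as an example) is present in one sequence but not the other.

```python
def find_substitutions(reference, sample):
    """List (position, ref_base, sample_base) for each single-base difference.

    Positions are 1-based. Sequences must already be aligned and equal in length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, r, s)
        for i, (r, s) in enumerate(zip(reference.upper(), sample.upper()))
        if r != s
    ]


def site_changed(reference, sample, site):
    """True if a restriction site occurs in exactly one of the two sequences."""
    return (site in reference.upper()) != (site in sample.upper())


# Illustrative sequences only; they are not sequences of the invention.
ref = "ACGAATTCGT"
var = "ACGAGTTCGT"
print(find_substitutions(ref, var))
print(site_changed(ref, var, "GAATTC"))
```

A real analysis would also need to handle insertions, deletions and sequencing quality, which this sketch deliberately ignores.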

Alternatively, a polymorphism resulting in a change in the amino acid sequence could
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g.,
by an antibody specific to the variant sequence.

4.10.20 ARTHRITIS AND INFLAMMATION

The immunosuppressive effects of the compositions of the invention against rheumatoid
arthritis are determined in an experimental animal model system. The experimental model system
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983,
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129.
Induction of the disease can be caused by a single injection, generally intradermally, of a
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant
mixture. The polypeptide is administered in phosphate buffered solution (PBS) at a dose of about
1-5 mg/kg. The control consists of administering PBS only.

The procedure for testing the effects of the test compound would consist of intradermally
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and
24 days after injection of Mycobacterium CFA, an overall arthritis score may be obtained as
described by J. Holoshitz above. An analysis of the data would reveal that the test compound
would have a dramatic effect on the swelling of the joints as measured by a decrease of the
arthritis score.

4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or 
other binding partners or modulators including antisense polynucleotides) of the invention have 
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications
include, but are not limited to, those exemplified herein.

4.11.1 EXAMPLE

One embodiment of the invention is the administration of an effective amount of the
polypeptides or other composition of the invention to individuals affected by a disease or
disorder that can be modulated by regulating the peptides of the invention. While the mode of
administration is not particularly important, parenteral administration is preferred. An
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the
polypeptides or other composition of the invention will normally be determined by the
prescribing physician. It is to be expected that the dosage will vary according to the age, weight,
condition and response of the individual patient. Typically, the amount of polypeptide
administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight, with
the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body weight. For parenteral
administration, polypeptides of the invention will be formulated in an injectable form combined
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting
of small amounts of human serum albumin. The vehicle may contain minor amounts of
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient.
The preparation of such solutions is within the skill of the art.
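
To make the per-kilogram ranges concrete, the arithmetic can be sketched as below. This is only an illustration of the unit conversion, not dosing guidance: the function name and the 70 kg body weight are assumptions, and the actual dosage is determined by the prescribing physician as stated above.

```python
def dose_range_mg(body_weight_kg, low_ug_per_kg, high_mg_per_kg):
    """Return (low, high) total dose in mg for a given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low_mg = body_weight_kg * low_ug_per_kg / 1000.0  # convert micrograms to mg
    high_mg = body_weight_kg * high_mg_per_kg
    return low_mg, high_mg


# Ranges quoted in the text: overall 0.01 ug/kg to 100 mg/kg,
# preferred 0.1 ug/kg to 10 mg/kg. The 70 kg weight is a hypothetical example.
overall = dose_range_mg(70.0, 0.01, 100.0)
preferred = dose_range_mg(70.0, 0.1, 10.0)
print(overall, preferred)
```

For the hypothetical 70 kg patient, the overall range works out to roughly 0.0007 mg to 7000 mg per dose and the preferred range to roughly 0.007 mg to 700 mg per dose.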



4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF
ADMINISTRATION

A protein or other composition of the present invention (from whatever source derived, 
including without limitation from recombinant and non-recombinant sources and including 
antibodies and other binding partners of the polypeptides of the invention) may be administered
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents,
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the
effectiveness of the biological activity of the active ingredient(s). The characteristics of the
carrier will depend on the route of administration. The pharmaceutical composition of the 
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as 
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12,
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell
factor, and erythropoietin. In further compositions, proteins of the invention may be combined
with other agents beneficial to the treatment of the disease or disorder in question. These agents 
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth 
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor
(IGF), as well as cytokines described herein.

The pharmaceutical composition may further contain other agents which either enhance
the activity of the protein or other active ingredient or complement its activity or use in
treatment. Such additional factors and/or agents may be included in the pharmaceutical
composition to produce a synergistic effect with protein or other active ingredient of the
invention, or to minimize side effects. Conversely, protein or other active ingredient of the
present invention may be included in formulations of the particular clotting factor, cytokine, 
lymphokine, other hematopoietic factor, thrombolytic or antithrombotic factor, or anti- 
inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other 
hematopoietic factor, thrombolytic or antithrombotic factor, or anti-inflammatory agent (such as
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or
complexes with itself or other proteins. As a result, pharmaceutical compositions of the
invention may comprise a protein of the invention in such multimeric or complexed form.
As an alternative to being included in a pharmaceutical composition of the invention
including a first protein, a second protein or a therapeutic agent may be concurrently
administered with the first protein (e.g., at the same time, or at differing times provided that
therapeutic concentrations of the combination of agents are achieved at the treatment site).
Techniques for formulation and administration of the compounds of the instant application may 
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest
edition. A therapeutically effective dose further refers to that amount of the compound sufficient
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the
relevant medical condition, or an increase in rate of treatment, healing, prevention or
amelioration of such conditions. When applied to an individual active ingredient, administered
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a
combination, a therapeutically effective dose refers to combined amounts of the active
ingredients that result in the therapeutic effect, whether administered in combination, serially or
simultaneously.

In practicing the method of treatment or use of the present invention, a therapeutically 
effective amount of protein or other active ingredient of the present invention is administered to
a mammal having a condition to be treated. Protein or other active ingredient of the present
invention may be administered in accordance with the method of the invention either alone or in
combination with other therapies such as treatments employing cytokines, lymphokines or other
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other
hematopoietic factors, protein or other active ingredient of the present invention may be
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic
factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially,
the attending physician will decide on the appropriate sequence of administering protein or other
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other
hematopoietic factor(s), thrombolytic or anti-thrombotic factors.


4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or 
intestinal administration; parenteral delivery, including intramuscular, subcutaneous, 
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous,
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active
ingredient of the present invention used in the pharmaceutical composition or to practice the 
method of the present invention can be carried out in a variety of conventional ways, such as oral 
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral 
or intravenous injection. Intravenous administration to the patient is preferred. 

Alternately, one may administer the compound in a local rather than systemic manner, for
example, via injection of the compound directly into arthritic joints or fibrotic tissue, often in
a depot or sustained release formulation. In order to prevent the scarring process frequently
occurring as a complication of glaucoma surgery, the compounds may be administered topically,
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery
system, for example, in a liposome coated with a specific antibody, targeting, for example,
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the
afflicted tissue.

The polypeptides of the invention are administered by any route that delivers an effective 
dosage to the desired site of action. The determination of a suitable route of administration and 
an effective dosage for a particular indication is within the level of skill in the art. Preferably for
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage
ranges for the polypeptides of the invention can be extrapolated from these dosages or from
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the
clinician to provide maximal therapeutic benefit.


4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus may 
be formulated in a conventional manner using one or more physiologically acceptable carriers 
comprising excipients and auxiliaries which facilitate processing of the active compounds into 
preparations which can be used pharmaceutically. These pharmaceutical compositions may be




WO 01/53312 PCT/US00/34263 

manufactured in a manner that is itself known, e.g., by means of conventional mixing, 
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or 
lyophilizing processes. Proper formulation is dependent upon the route of administration chosen. 
When a therapeutically effective amount of protein or other active ingredient of the present 
invention is administered orally, protein or other active ingredient of the present invention will
be in the form of a tablet, capsule, powder, solution or elixir. When administered in tablet form, 
the pharmaceutical composition of the invention may additionally contain a solid carrier such as 
a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95% protein or 
other active ingredient of the present invention, and preferably from about 25 to 90% protein or
other active ingredient of the present invention. When administered in liquid form, a liquid
carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, mineral oil, 
soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the 
pharmaceutical composition may further contain physiological saline solution, dextrose or other 
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol.
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to
90% by weight of protein or other active ingredient of the present invention, and preferably from 
about 1 to 50% protein or other active ingredient of the present invention. 

When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other 
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within 
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or 
subcutaneous injection should contain, in addition to protein or other active ingredient of the
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection,
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or
other vehicle as known in the art. The pharmaceutical composition of the present invention may
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of 
skill in the art. For injection, the agents of the invention may be formulated in aqueous solutions,
preferably in physiologically compatible buffers such as Hanks' solution, Ringer's solution, or
physiological saline buffer. For transmucosal administration, penetrants appropriate to the
barrier to be permeated are used in the formulation. Such penetrants are generally known in the
art. 

For oral administration, the compounds can be formulated readily by combining the
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules,
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be
treated. Pharmaceutical preparations for oral use can be obtained by combining the active
compounds with a solid excipient, optionally grinding the resulting mixture, and processing the
mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee cores.
Suitable excipients are, in
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose
preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin,
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic,
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be
added to the tablets or dragee coatings for identification or to characterize different combinations
of active compound doses. 

Pharmaceutical preparations which can be used orally include push-fit capsules made of 
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or 
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as 
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and, 
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in 
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, 
stabilizers may be added. All formulations for oral administration should be in dosages suitable 
for such administration. For buccal administration, the compositions may take the form of 
tablets or lozenges formulated in conventional manner. 

For administration by inhalation, the compounds for use according to the present 
invention are conveniently delivered in the form of an aerosol spray presentation from 
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g., 
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or 
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by 
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in 
an inhaler or insufflator may be formulated containing a powder mix of the compound and a
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral 
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for 
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with
an added preservative. The compositions may take such forms as suspensions, solutions or
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, 
stabilizing and/or dispersing agents. 

Pharmaceutical formulations for parenteral administration include aqueous solutions of 
the active compounds in water-soluble form. Additionally, suspensions of the active compounds 
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or 
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or 
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which 
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or 
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which 
increase the solubility of the compounds to allow for the preparation of highly concentrated 
solutions. Alternatively, the active ingredient may be in powder form for constitution with a 
suitable vehicle, e.g., sterile pyrogen-free water, before use. 

The compounds may also be formulated in rectal compositions such as suppositories or 
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other 
glycerides. In addition to the formulations described previously, the compounds may also be 
formulated as a depot preparation. Such long acting formulations may be administered by 
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. 
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic 
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as 
sparingly soluble derivatives, for example, as a sparingly soluble salt. 

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution 
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v 
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system 
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent 
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic 
administration. Naturally, the proportions of a co-solvent system may be varied considerably 
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the 
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may 
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other 
biocompatible polymers may replace polyethylene glycol, e.g. polyvinyl pyrrolidone; and other 
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for 
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity. 
Additionally, the compounds may be delivered using a sustained-release system, such as 
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent. 
Various types of sustained-release materials have been established and are well known by those
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the 
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the 
biological stability of the therapeutic reagent, additional strategies for protein or other active 
ingredient stabilization may be employed. 
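
The 1:1 dilution of VPD with 5% dextrose in water described above can be worked through numerically. The following is a minimal sketch, assuming that "diluted 1:1" means equal volumes of VPD and diluent and that volumes are additive (only approximately true for ethanol/water mixtures). The function name and dictionary layout are illustrative and are not part of the specification.

```python
# Component concentrations of undiluted VPD in % w/v, as stated in the text.
VPD_PERCENT_WV = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}

# The diluent: 5% dextrose in water.
DILUENT_PERCENT_WV = {"dextrose": 5.0}


def dilute_one_to_one(stock, diluent):
    """Return final % w/v of each component after mixing equal volumes.

    Assumption: volumes are additive, so every concentration is halved.
    """
    names = set(stock) | set(diluent)
    return {n: (stock.get(n, 0.0) + diluent.get(n, 0.0)) / 2.0 for n in names}


vpd_5w = dilute_one_to_one(VPD_PERCENT_WV, DILUENT_PERCENT_WV)
for name in sorted(vpd_5w):
    print(f"{name}: {vpd_5w[name]:.2f}% w/v")
```

Under these assumptions the VPD:5W system would contain about 1.5% benzyl alcohol, 4% polysorbate 80, 32.5% polyethylene glycol 300 and 2.5% dextrose (all w/v).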

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers
or excipients. Examples of such carriers or excipients include but are not limited to calcium
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically
acceptable base addition salts are those salts which retain the biological effectiveness and
properties of the free acids and which are obtained by reaction with inorganic or organic bases
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine, 
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanol amine and 
the like. 

The pharmaceutical composition of the invention may be in the form of a complex of the
protein(s) or other active ingredient(s) of the present invention along with protein or peptide
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following
presentation of the antigen by MHC proteins. MHC and structurally related proteins including
those encoded by class I and class II MHC genes on host cells will serve to present the peptide
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells.
Alternatively, antibodies able to bind surface immunoglobulin and other molecules on B cells as
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the
pharmaceutical composition of the invention. 

The pharmaceutical composition of the invention may be in the form of a liposome in 
which protein of the present invention is combined, in addition to other pharmaceutically 
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides,
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such 
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S. 
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated 
herein by reference.

The amount of protein or other active ingredient of the present invention in the 
pharmaceutical composition of the present invention will depend upon the nature and severity of 
the condition being treated, and on the nature of prior treatments which the patient has 
undergone. Ultimately, the attending physician will decide the amount of protein or other active 
ingredient of the present invention with which to treat each individual patient. Initially, the
attending physician will administer low doses of protein or other active ingredient of the present
invention and observe the patient's response. Larger doses of protein or other active ingredient
of the present invention may be administered until the optimal therapeutic effect is obtained for
the patient, and at that point the dosage is not increased further. It is contemplated that the
various pharmaceutical compositions used to practice the method of the present invention should
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more preferably
about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg
body weight. For compositions of the present invention which are useful for bone, cartilage,
tendon or ligament regeneration, the therapeutic method includes administering the composition
topically, systemically, or locally as an implant or device. When administered, the therapeutic
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable
form. Further, the composition may desirably be encapsulated or injected in a viscous form for
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other
active ingredient of the invention which may also optionally be included in the composition as
described above, may alternatively or additionally, be administered simultaneously or
sequentially with the composition in the methods of the invention. Preferably for bone and/or
cartilage formation, the composition would include a matrix capable of delivering the
protein-containing or other active ingredient-containing composition to the site of bone and/or
cartilage damage, providing a structure for the developing bone and cartilage and optimally
capable of being resorbed into the body. Such matrices may be formed of materials presently in
use for other implanted medical applications. 

The choice of matrix material is based on biocompatibility, biodegradability, mechanical 
properties, cosmetic appearance and interface properties. The particular application of the 
compositions will define the appropriate formulation. Potential matrices for the compositions
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate,
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further
matrices are comprised of pure proteins or extracellular matrix components. Other potential
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass,
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above
mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and
tricalcium phosphate. The bioceramics may be altered in composition, such as in
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns.
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from
the matrix.

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose,
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate,
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol).
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on
total formulation weight, which represents the amount necessary to prevent desorption of the
protein from the polymer matrix and to provide appropriate handling of the composition, yet not
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further
compositions, proteins or other active ingredients of the invention may be combined with other
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in
question. These agents include various growth factors such as epidermal growth factor (EGF),
platelet derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and
insulin-like growth factor (IGF).

The therapeutic compositions are also presently valuable for veterinary applications. 
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired 
patients for such treatment with proteins or other active ingredients of the present invention. The 
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue
regeneration will be determined by the attending physician considering various factors which
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g.,
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution and
with inclusion of other proteins in the pharmaceutical composition. For example, the addition of
other known growth factors, such as IGF I (insulin like growth factor I), to the final composition,
may also affect the dosage. Progress can be monitored by periodic assessment of tissue/bone
growth and/or repair, for example, X-rays, histomorphometric determinations and tetracycline 
labeling. 

Polynucleotides of the present invention can also be used for gene therapy. Such 
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a 
mammalian subject. Polynucleotides of the invention may also be administered by other known 
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in 
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 

4.12.3 EFFECTIVE DOSAGE

Pharmaceutical compositions suitable for use in the present invention include 
compositions wherein the active ingredients are contained in an effective amount to achieve its 
intended purpose. More specifically, a therapeutically effective amount means an amount 
effective to prevent development of or to alleviate the existing symptoms of the subject being
treated. Determination of the effective amount is well within the capability of those skilled in 
the art, especially in light of the detailed disclosure provided herein. For any compound used in 
the method of the invention, the therapeutically effective dose can be estimated initially from 
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a 
circulating concentration range that can be used to more accurately determine useful doses in 
humans. For example, a dose can be formulated in animal models to achieve a circulating
concentration range that includes the IC50 as determined in cell culture (i.e., the concentration of
the test compound which achieves a half-maximal inhibition of the protein's biological activity). 
Such information can be used to more accurately determine useful doses in humans. 
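
The half-maximal inhibitory concentration defined above can be read off a measured inhibition curve. The sketch below is one simple way to do so, assuming hypothetical assay data and log-linear interpolation between the two measurements that bracket 50% inhibition. The specification does not prescribe any particular fitting method, and the function name is illustrative only.

```python
import math


def estimate_ic50(concentrations, inhibition_percent):
    """Estimate the concentration giving 50% inhibition.

    concentrations: ascending, positive test-compound concentrations.
    inhibition_percent: measured inhibition (0-100) at each concentration.
    Interpolates linearly on log10(concentration) between bracketing points.
    """
    points = list(zip(concentrations, inhibition_percent))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        if y_lo <= 50.0 <= y_hi and y_hi != y_lo:
            frac = (50.0 - y_lo) / (y_hi - y_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    raise ValueError("inhibition data do not bracket 50%")


# Hypothetical data: concentrations in arbitrary units, inhibition in percent.
ic50 = estimate_ic50([1.0, 10.0, 100.0], [20.0, 40.0, 60.0])
print(f"Estimated IC50: {ic50:.1f}")
```

In practice a full dose-response model fitted to replicate measurements would usually be preferred; the interpolation here only illustrates the definition.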

A therapeutically effective dose refers to that amount of the compound that results in 
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic 
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell 
cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred.
The data obtained from these cell culture assays and animal studies can be used in formulating a
range of dosage for use in humans. The dosage of such compounds lies preferably within a range
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may
vary within this range depending upon the dosage form employed and the route of administration
utilized. The exact formulation, route of administration and dosage can be chosen by the
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The
Pharmacological Basis of Therapeutics", Ch. 1, p. 1. Dosage amount and interval may be adjusted
individually to provide plasma levels of the active moiety which are sufficient to maintain the
desired effects, or minimal effective concentration (MEC). The MEC will vary for each
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will
depend on individual characteristics and route of administration. However, HPLC assays or
bioassays can be used to determine plasma concentrations.
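
The therapeutic index described above is a simple ratio, and a short sketch makes the arithmetic explicit. The numerical values below are hypothetical and serve only to illustrate the calculation; the function name is likewise an assumption.

```python
def therapeutic_index(ld50, ed50):
    """Return the therapeutic index, the ratio of LD50 to ED50.

    Both doses must be positive and expressed in the same units
    (for example mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical compound: LD50 of 200 mg/kg and ED50 of 4 mg/kg.
ti = therapeutic_index(200.0, 4.0)
print(f"Therapeutic index: {ti:.0f}")
```

A larger ratio indicates a wider margin between the effective dose and the lethal dose, which is why compounds with high therapeutic indices are preferred.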

Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the
time, preferably between 30-90% and most preferably between 50-90%. In cases of local
administration or selective uptake, the effective local concentration of the drug may not be
related to plasma concentration.
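
Whether a regimen keeps plasma levels above the MEC for the stated fraction of the time can be checked from sampled plasma concentrations. The following sketch assumes hypothetical concentration-time samples and linear interpolation between them. The sampling scheme and function name are illustrative and do not come from the specification.

```python
def fraction_time_above(times, concentrations, mec):
    """Fraction of the sampled period during which concentration >= mec.

    times: ascending sample times (any consistent unit).
    concentrations: plasma concentration at each sample time.
    Concentration is assumed to vary linearly between samples.
    """
    total = times[-1] - times[0]
    above = 0.0
    samples = list(zip(times, concentrations))
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        if c0 >= mec and c1 >= mec:
            above += t1 - t0
        elif c0 < mec and c1 < mec:
            continue
        else:
            # Exactly one endpoint is above the MEC: locate the crossing.
            t_cross = t0 + (t1 - t0) * (mec - c0) / (c1 - c0)
            above += (t_cross - t0) if c0 >= mec else (t1 - t_cross)
    return above / total


# Hypothetical profile over an 8-hour dosing interval, with an MEC of 2.0.
frac = fraction_time_above([0, 2, 4, 6, 8], [0.0, 4.0, 4.0, 0.0, 0.0], 2.0)
print(f"Time above MEC: {frac:.0%}")
print("Within the 10-90% window:", 0.10 <= frac <= 0.90)
```

For this hypothetical profile the plasma level is at or above the MEC for half of the interval, which falls inside the broad 10-90% window and at the lower edge of the most preferred 50-90% window.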

An exemplary dosage regimen for polypeptides or other compositions of the invention 
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred
dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying in adults and
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter
intervals.
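
Because the regimen above is expressed per kilogram of body weight, the daily amount for a given subject follows by multiplication. The sketch below illustrates the unit handling only, reading the preferred range as 0.1 µg/kg to 25 mg/kg daily. The constant and function names and the example weight are assumptions made for illustration, and the actual dose remains a matter for the prescribing physician as stated in the text.

```python
# Preferred daily range from the text, expressed in micrograms per kilogram.
PREFERRED_MIN_UG_PER_KG = 0.1        # 0.1 µg/kg
PREFERRED_MAX_UG_PER_KG = 25_000.0   # 25 mg/kg = 25,000 µg/kg


def daily_dose_ug(dose_ug_per_kg, body_weight_kg):
    """Return the total daily amount in micrograms for one subject.

    Raises ValueError if the per-kg dose lies outside the preferred range
    or the body weight is not positive.
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if not PREFERRED_MIN_UG_PER_KG <= dose_ug_per_kg <= PREFERRED_MAX_UG_PER_KG:
        raise ValueError("dose is outside the preferred range")
    return dose_ug_per_kg * body_weight_kg


# Hypothetical example: 10 µg/kg for a 70 kg adult.
total_ug = daily_dose_ug(10.0, 70.0)
print(f"Total daily amount: {total_ug:.0f} µg ({total_ug / 1000:.2f} mg)")
```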

The amount of composition administered will, of course, be dependent on the subject 
being treated, on the subject's age and weight, the severity of the affliction, the manner of 
administration and the judgment of the prescribing physician. 

4.12.4 PACKAGING

The compositions may, if desired, be presented in a pack or dispenser device which may 
contain one or more unit dosage forms containing the active ingredient. The pack may, for 
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may 
be accompanied by instructions for administration. Compositions comprising a compound of the
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an
appropriate container, and labeled for treatment of an indicated condition.



4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins, of the
invention. The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and F(ab')2
fragments, and an Fab expression library. In general, an antibody molecule obtained from
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well,
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or
a lambda chain. Reference herein to antibodies includes a reference to all such classes,
subclasses and types of human antibody species.

An isolated related protein of the invention may be intended to serve as an antigen, or a
portion or fragment thereof, and additionally can be used as an immunogen to generate
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the
invention provides antigenic peptide fragments of the antigen for use as immunogens. An
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence
of the full length protein, such as an amino acid sequence shown in SEQ ID NO: 1787, and
encompasses an epitope thereof such that an antibody raised against the peptide forms a specific
immune complex with the full length protein or with any fragment that contains the epitope.
Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino
acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred
epitopes encompassed by the antigenic peptide are regions of the protein that are located on its
surface; commonly these are hydrophilic regions.

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a region of a related protein that is located on the surface of the protein, e.g., a
hydrophilic region. A hydrophobicity analysis of the human related protein sequence will
indicate which regions of a related protein are particularly hydrophilic and, therefore, are likely
to encode surface residues useful for targeting antibody production. As a means for targeting
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity
may be generated by any method well known in the art, including, for example, the
Kyte-Doolittle or the Hopp-Woods methods, either with or without Fourier transformation. See, e.g.,
Hopp and Woods, 1981, Proc. Natl. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle, 1982, J.
Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety.
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives,
fragments, analogs or homologs thereof, are also provided herein.
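
A hydropathy profile of the kind referred to above can be sketched as a sliding-window average over the Kyte-Doolittle hydropathy scale, without Fourier transformation. The function names and the window length of 7 residues are assumptions made for illustration; lower averages indicate more hydrophilic stretches, which are the candidate surface regions.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean hydropathy of every window of the given length along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[res] for res in seq]  # KeyError on non-standard residues
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=7):
    """Return (start index, peptide, mean score) of the lowest-scoring window."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]
```

As a usage example, `most_hydrophilic_window(protein_sequence, window=7)` returns the start position and residues of the most hydrophilic 7-residue stretch of a one-letter amino acid string, which could then be considered as a candidate antigenic peptide.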

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog 
thereof, may be utilized as an immunogen in the generation of antibodies that 
immunospecifically bind these protein components. 

Various procedures known within the art may be used for the production of polyclonal or
monoclonal antibodies directed against a protein of the invention, or against derivatives,
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below.

5.13.1 Polyclonal Antibodies 

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit, 
goat, mouse or other mammal) may be immunized by one or more injections with the native 
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate 
immunogenic preparation can contain, for example, the naturally occurring immunogenic 
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a 
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to 
a second protein known to be immunogenic in the mammal being immunized. Examples of such 
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin, 
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an 
adjuvant. Various adjuvants used to increase the immunological response include, but are not 
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface 
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, 
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and 
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of 
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A, 
synthetic trehalose dicorynomycolate). 

The polyclonal antibody molecules directed against the immunogenic protein can be 
isolated from the mammal (e.g., from the blood) and further purified by well known techniques, 
such as affinity chromatography using protein A or protein G, which provide primarily the IgG 
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to
purify the immune specific antibody by immunoaffinity chromatography. Purification of 
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The 
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28). 

5.13.2 Monoclonal Antibodies 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used 
herein, refers to a population of antibody molecules that contain only one molecular species of 
antibody molecule consisting of a unique light chain gene product and a unique heavy chain 
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal 
antibody are identical in all the molecules of the population. MAbs thus contain an antigen 
binding site capable of immunoreacting with a particular epitope of the antigen characterized by 
a unique binding affinity for it. 

Monoclonal antibodies can be prepared using hybridoma methods, such as those 
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse,
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to 
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind 
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro. 
The immunizing agent will typically include the protein antigen, a fragment thereof or a fusion 
protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human origin 
are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are 
desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing
agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies: 
Principles and Practice. Academic Press, (1986) pp. 59-103). Immortalized cell lines are usually 
transformed mammalian cells, particularly myeloma cells of rodent, bovine and human origin. 
Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can be cultured in 
a suitable culture medium that preferably contains one or more substances that inhibit the growth 
or survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme 
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for 
the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT 
medium"), which substances prevent the growth of HGPRT-deficient cells. 

Preferred immortalized cell lines are those that fuse efficiently, support stable high level 
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium 
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which 
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego,
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and
mouse-human heteromyeloma cell lines also have been described for the production of human
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal
Antibody Production Techniques and Applications. Marcel Dekker, Inc., New York, (1987) pp.
51-63).

The culture medium in which the hybridoma cells are cultured can then be assayed for 
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding 
specificity of monoclonal antibodies produced by the hybridoma cells is determined by 
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or 
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are known in the
art. The binding affinity of the monoclonal antibody can, for example, be determined by the
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably,
antibodies having a high degree of specificity and a high binding affinity for the target antigen 
are isolated. 
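
A Scatchard-type estimate of binding affinity can be sketched as a straight-line fit of bound/free ligand against bound ligand, where the slope equals the negative of the association constant Ka and the x-intercept equals the total binding capacity Bmax. This is a plain unweighted linear fit for a single class of binding sites, given for illustration only; it is not the fitting procedure of the cited reference, and the function name and the synthetic data are assumptions.

```python
def scatchard_fit(bound, free):
    """Fit B/F = -Ka * B + Ka * Bmax by least squares; return (Ka, Bmax)."""
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two matched (bound, free) measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]  # bound/free ratio
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("bound values must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("data do not show saturable single-site binding")
    intercept = mean_y - slope * mean_x
    ka = -slope
    bmax = intercept / ka  # x-intercept of the fitted line
    return ka, bmax


# Synthetic, noise-free data generated from assumed values Ka = 2.0 and Bmax = 10.0.
assumed_ka, assumed_bmax = 2.0, 10.0
free_conc = [0.1, 0.5, 1.0, 2.0, 5.0]
bound_conc = [assumed_bmax * assumed_ka * f / (1 + assumed_ka * f) for f in free_conc]
ka, bmax = scatchard_fit(bound_conc, free_conc)
```

Because the synthetic data follow the single-site binding equation exactly, the fit recovers the assumed Ka and Bmax; real assay data would scatter around the line, and a higher fitted Ka would indicate a higher binding affinity.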

After the desired hybridoma cells are identified, the clones can be subcloned by limiting
dilution procedures and grown by standard methods. Suitable culture media for this purpose
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium.
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal.
The monoclonal antibodies secreted by the subclones can be isolated or purified from the culture
medium or ascites fluid by conventional immunoglobulin purification procedures such as, for
example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or
affinity chromatography.

The monoclonal antibodies can also be made by recombinant DNA methods, such as
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the
invention can be readily isolated and sequenced using conventional procedures (e.g., by using
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for
example, by substituting the coding sequence for human heavy and light chain constant domains
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368,
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin
polypeptide can be substituted for the constant domains of an antibody of the invention, or can
be substituted for the variable domains of one antigen-combining site of an antibody of the
invention to create a chimeric bivalent antibody.

5.13.2 Humanized Antibodies

The antibodies directed against the protein antigens of the invention can further comprise 
humanized antibodies or human antibodies. These antibodies are suitable for administration to 
humans without engendering an immune response by the human against the administered 
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins, 
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other
antigen-binding subsequences of antibodies) that are principally comprised of the sequence of a human
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin. 
Humanization can be performed following the method of Winter and co-workers (Jones et al., 
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al.,
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some 
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding 
non-human residues. Humanized antibodies can also comprise residues which are found neither 
in the recipient antibody nor in the imported CDR or framework sequences. In general, the 
humanized antibody will comprise substantially all of at least one, and typically two, variable 
domains, in which all or substantially all of the CDR regions correspond to those of a non-human 
immunoglobulin and all or substantially all of the framework regions are those of a human 
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at 
least a portion of an immunoglobulin constant region (Fc), typically that of a human 
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol.,
2:593-596 (1992)).

5.13.3 Human Antibodies

Fully human antibodies relate to antibody molecules in which essentially the entire 
sequences of both the light chain and the heavy chain, including the CDRs, arise from human 
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein.
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal
antibodies may be utilized in the practice of the present invention and may be produced by using
human hybridomas (see Cote, et al., 1983, Proc Natl Acad Sci USA 80: 2026-2030) or by
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In:
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).
In addition, human antibodies can also be produced using additional techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the
endogenous immunoglobulin genes have been partially or completely inactivated. Upon
challenge, human antibody production is observed, which closely resembles that seen in humans
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126;
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al.
(Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and
Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman animals 
which are modified so as to produce fully human antibodies rather than the animal's endogenous 
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The 
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins 
are inserted into the host's genome. The human genes are incorporated, for example, using yeast 
artificial chromosomes containing the requisite human DNA segments. An animal which 
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate 
transgenic animals containing fewer than the full complement of the modifications. The
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B cells 
which secrete fully human immunoglobulins. The antibodies can be obtained directly from the 
animal after immunization with an immunogen of interest, as, for example, a preparation of a 
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the
immunoglobulins with human variable regions can be recovered and expressed to obtain the
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for
example, single chain Fv molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No.
5,939,598. It can be obtained by a method including deleting the J segment genes from at least
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus,
the deletion being effected by a targeting vector containing a gene encoding a selectable marker; 
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells 
contain the gene encoding the selectable marker. 

A method for producing an antibody of interest, such as a human antibody, is disclosed in
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing
an expression vector containing a nucleotide sequence encoding a light chain into another 
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an 
antibody containing the heavy chain and the light chain. 

In a further improvement on this procedure, a method for identifying a clinically relevant
epitope on an immunogen, and a correlative method for selecting an antibody that binds
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication 
WO 99/53049. 

5.13.4 Fab Fragments and Single Chain Antibodies

According to the invention, techniques can be adapted for the production of single-chain
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778).
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g.,
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments,
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the
treatment of the antibody molecule with papain and a reducing agent; and (iv) Fv fragments.

5.13.5 Bispecific Antibodies

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that
have binding specificities for at least two different antigens. In the present case, one of the
binding specificities is for an antigenic protein of the invention. The second binding target is any
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit.

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two 
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different 
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random 
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a 
potential mixture of ten different antibody molecules, of which only one has the correct 
bispecific structure. The purification of the correct molecule is usually accomplished by affinity 
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May 
1993, and in Traunecker et al., 1991, EMBO J., 10:3655-3659.
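
The figure of ten different antibody molecules can be reproduced by enumeration. The sketch below assumes that each antibody consists of two heavy-chain/light-chain half-molecules, that any heavy chain may pair with any light chain, and that the two halves are interchangeable; the chain labels H1, H2, L1 and L2 are hypothetical.

```python
from itertools import combinations_with_replacement, product

heavy_chains = ["H1", "H2"]  # heavy chains of the two parental antibodies
light_chains = ["L1", "L2"]  # light chains of the two parental antibodies

# Random assortment: any heavy chain can pair with any light chain (4 half-molecules).
half_molecules = [h + l for h, l in product(heavy_chains, light_chains)]

# An antibody is an unordered pair of half-molecules, which may be identical.
antibodies = list(combinations_with_replacement(half_molecules, 2))

# Only the pairing of the two cognate halves gives the desired bispecific molecule.
correct = [ab for ab in antibodies if set(ab) == {"H1L1", "H2L2"}]
```

Under these assumptions `antibodies` contains ten distinct molecules and `correct` contains exactly one, which illustrates why affinity chromatography steps are needed to isolate the desired species.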

Antibody variable domains with the desired binding specificities (antibody-antigen 
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion 
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of 
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region 
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions.
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin 
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable 
host organism. For further details of generating bispecific antibodies see, for example, Suresh et
al., Methods in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between a pair
of antibody molecules can be engineered to maximize the percentage of heterodimers which are 
recovered from recombinant cell culture. The preferred interface comprises at least a part of the 
CH3 region of an antibody constant domain. In this method, one or more small amino acid side 
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g. 
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side 
chain(s) are created on the interface of the second antibody molecule by replacing large amino 
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for 
increasing the yield of the heterodimer over other unwanted end-products such as homodimers. 

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g.
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody
fragments have been described in the literature. For example, bispecific antibodies can be
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific
antibody. The bispecific antibodies produced can be used as agents for the selective
immobilization of enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and chemically
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity
of human cytotoxic lymphocytes against human breast tumor targets.

Various techniques for making and isolating bispecific antibody fragments directly from
recombinant cell culture have also been described. For example, bispecific antibodies have been
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region
to form monomers and then re-oxidized to form the antibody heterodimers. This method can
also be utilized for the production of antibody homodimers. The "diabody" technology
described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an
alternative mechanism for making bispecific antibody fragments. The fragments comprise a
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker
which is too short to allow pairing between the two domains on the same chain. Accordingly,
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been
reported. See, Gruber et al., J. Immunol. 152:5368 (1994).

Antibodies with more than two valencies are contemplated. For example, trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular
defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also
be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide
chelator, such as EOTUBE, DTPA, DOTA, or TETA. Another bispecific antibody of interest
binds the protein antigen described herein and further binds tissue factor (TF).



5.13.6 Heteroconjugate Antibodies 

Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089).
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic
protein chemistry, including those involving crosslinking agents. For example, immunotoxins
can be constructed using a disulfide exchange reaction or by forming a thioether bond.
Examples of suitable reagents for this purpose include iminothiolate and
methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980.



5.13.7 Effector Function Engineering 

It can be desirable to modify the antibody of the invention with respect to effector function, so as
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond
formation in this region. The homodimeric antibody thus generated can have improved
internalization capability and/or increased complement-mediated cell killing and
antibody-dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp. Med., 176: 1191-1195 (1992)
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced
anti-tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff
et al., Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that
has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities.
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989).


5.13.8 Immunoconjugates 

The invention also pertains to immunoconjugates comprising an antibody conjugated to a 
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of 
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a 
radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have been
described above. Enzymatically active toxins and fragments thereof that can be used include
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin,
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and
PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis inhibitor, gelonin,
mitogellin, restrictocin, phenomycin, enomycin, and the tricothecenes. A variety of
radionuclides are available for the production of radioconjugated antibodies. Examples include
212Bi, 131I, 131In, 90Y, and 186Re.
Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP),
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCl),
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate),
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid
(MX-DTPA) is an exemplary chelating agent for conjugation of radionuclide to the antibody. See
WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as 
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is 
administered to the patient, followed by removal of unbound conjugate from the circulation 
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn 
conjugated to a cytotoxic agent. 

4.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present invention can 
be recorded on computer readable media. As used herein, "computer readable media" refers to 
any medium which can be read and accessed directly by a computer. Such media include, but 
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and 
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM 
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled 
artisan can readily appreciate how any of the presently known computer readable media can 
be used to create a manufacture comprising computer readable medium having recorded thereon 
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for 
storing information on computer readable medium. A skilled artisan can readily adopt any of the 
presently known methods for recording information on computer readable medium to generate 
manufactures comprising the nucleotide sequence information of the present invention. 

A variety of data storage structures are available to a skilled artisan for creating a 
computer readable medium having recorded thereon a nucleotide sequence of the present 
invention. The choice of the data storage structure will generally be based on the means chosen 
to access the stored information. In addition, a variety of data processor programs and formats 
can be used to store the nucleotide sequence information of the present invention on computer 
readable medium. The sequence information can be represented in a word processing text file, 
formatted in commercially-available software such as WordPerfect and Microsoft Word, or 
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, 
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring 
formats (e.g., text file or database) in order to obtain computer readable medium having recorded 
thereon the nucleotide sequence information of the present invention. 
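
By way of illustration only, the two storage options named above (a plain ASCII text file and a database application) might be sketched as follows. The sketch assumes SQLite as a stand-in for the database applications listed above; the table name, column names, FASTA-style header format and the short placeholder sequences are hypothetical and are not sequences of the invention.

```python
import sqlite3


def store_sequences(conn, records):
    """Record (seq_id, residues) pairs in a simple database table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences ("
        "seq_id INTEGER PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO sequences (seq_id, residues) VALUES (?, ?)",
        records,
    )
    conn.commit()


def fetch_sequence(conn, seq_id):
    """Return the stored residues for seq_id, or None if absent."""
    row = conn.execute(
        "SELECT residues FROM sequences WHERE seq_id = ?", (seq_id,)
    ).fetchone()
    return row[0] if row else None


def to_fasta(seq_id, residues, width=60):
    """Format one record as ASCII text with a header line and wrapped residues."""
    lines = [f">SEQ_ID_NO_{seq_id}"]
    lines += [residues[i:i + width] for i in range(0, len(residues), width)]
    return "\n".join(lines) + "\n"


# Placeholder data only; an in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
store_sequences(conn, [(1, "ATGGCCTAA"), (2, "ATGAAATGA")])
```

The choice between the two representations would generally follow the means chosen to access the stored information, as noted above: a flat text file suits sequential reading, while a database table suits lookup by identifier.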

By providing any of the nucleotide sequences SEQ ID NO:1-1786 and 3573-5358 or a 
representative fragment thereof, or a nucleotide sequence at least 95% identical to any of the 
nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358 in computer readable form, a skilled 
artisan can routinely access the sequence information for a variety of purposes. Computer 
software is publicly available which allows a skilled artisan to access sequence information 
provided in a computer readable medium. The examples which follow demonstrate how 
software which implements the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and 
BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase system 
is used to identify open reading frames (ORFs) within a nucleic acid sequence. Such ORFs may 
be protein encoding fragments and may be useful in producing commercially important proteins 
such as enzymes used in fermentation reactions and in the production of commercially useful 
metabolites. 
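
As a minimal sketch of what identifying ORFs in a recorded sequence involves, the following scans all six reading frames for an ATG start codon followed in-frame by a stop codon. This is a naive scan offered for illustration, not the BLAST or BLAZE algorithms referred to above; the function names and the `min_codons` threshold are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=2):
    """Return (strand, start, end, orf) tuples for ATG...stop reading frames.

    Coordinates are 0-based, end-exclusive, and refer to the strand that was
    scanned (the reverse complement for strand "-"). min_codons counts the
    start and stop codons.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens an ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # resume scanning after the stop codon
    return orfs
```

For example, `find_orfs("CCATGAAATAGGG", min_codons=3)` reports a single forward-strand ORF, `ATGAAATAG`, spanning positions 2 to 11.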

As used herein, "a computer-based system" refers to the hardware means, software 
means, and data storage means used to analyze the nucleotide sequence information of the 
present invention. The minimum hardware means of the computer-based systems of the present 
invention comprises a central processing unit (CPU), input means, output means, and data 
storage means. A skilled artisan can readily appreciate that any one of the currently available 
computer-based systems is suitable for use in the present invention. As stated above, the 
computer-based systems of the present invention comprise a data storage means having stored 
therein a nucleotide sequence of the present invention and the necessary hardware means and 
software means for supporting and implementing a search means. As used herein, "data storage 
means" refers to memory which can store nucleotide sequence information of the present 
invention, or a memory access means which can access manufactures having recorded thereon 
the nucleotide sequence information of the present invention. 

As used herein, "search means" refers to one or more programs which are implemented 
on the computer-based system to compare a target sequence or target structural motif with the 
sequence information stored within the data storage means. Search means are used to identify 
fragments or regions of a known sequence which match a particular target sequence or target 
motif. A variety of known algorithms are disclosed publicly and a variety of commercially 
available software for conducting search means are available and can be used in the computer-based 
systems of the present invention. Examples of such software include, but are not limited to, 
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTA (NCBI). A 
skilled artisan can readily recognize that any one of the available algorithms or implementing 
software packages for conducting homology searches can be adapted for use in the present 
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino 
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can 
readily recognize that the longer a target sequence is, the less likely a target sequence will be 
present as a random occurrence in the database. The most preferred sequence length of a target 
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide 
residues. However, it is well recognized that searches for commercially important fragments, 
such as sequence fragments involved in gene expression and protein processing, may be of 
shorter length. 
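
The relationship between target length and chance occurrence can be illustrated with a rough estimate. Assuming, purely for illustration, a database of independent and equally frequent residues, the expected number of exact chance matches of a target of length L in a database of N residues is approximately (N - L + 1) multiplied by A to the power -L, where A is the alphabet size (4 for nucleotides, 20 for amino acids). Real sequence databases are not random, so the figure is only an order-of-magnitude guide; the function name is hypothetical.

```python
def expected_random_hits(target_len, db_len, alphabet_size=4):
    """Expected number of exact chance matches under a uniform random model."""
    if target_len > db_len:
        return 0.0
    positions = db_len - target_len + 1  # possible alignment start positions
    return positions * alphabet_size ** (-target_len)
```

Under this model a six-nucleotide target is expected about once by chance in roughly 4,100 residues, whereas doubling the target length to twelve nucleotides reduces the expectation by a factor of 4 to the power 6.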

As used herein, "a target structural motif," or "target motif," refers to any rationally 
selected sequence or combination of sequences in which the sequence(s) are chosen based on a 
three-dimensional configuration which is formed upon the folding of the target motif. There are 
a variety of target motifs known in the art. Protein target motifs include, but are not limited to, 
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited 
to, promoter sequences, hairpin structures and inducible expression elements (protein binding 
sequences). 
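
As one hedged example of a search for a nucleic acid target motif, the following sketch looks for candidate hairpin structures, i.e., a stem sequence followed, after a short loop, by its reverse complement. It checks exact stem complementarity only and takes no account of thermodynamic stability; the function name and default stem and loop lengths are assumptions chosen for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_hairpins(seq, stem_len=4, min_loop=3, max_loop=8):
    """Return (stem_start, loop_len) for each perfect inverted repeat found."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - stem_len + 1):
        partner = reverse_complement(seq[i:i + stem_len])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem_len + loop  # start of the would-be 3' stem arm
            if j + stem_len > len(seq):
                break  # longer loops would run past the end of the sequence
            if seq[j:j + stem_len] == partner:
                hits.append((i, loop))
    return hits
```

In the test sequence `TTGGACAAAAGTCCTT`, the stem `GGAC` at position 2 pairs with `GTCC` across a four-base loop, so the function returns `[(2, 4)]` when called with `stem_len=4, min_loop=3, max_loop=5`.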

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used to 
control gene expression through triple helix formation or antisense DNA or RNA, both of which 
methods are based on the binding of a polynucleotide sequence to DNA or RNA. 
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are 
designed to be complementary to a region of the gene involved in transcription (triple helix - see 
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan 
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560 
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca 
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription 
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into 
polypeptide. Both techniques have been demonstrated to be effective in model systems. 
Information contained in the sequences of the present invention is necessary for the design of an 
antisense or triple helix oligonucleotide. 
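
To illustrate how sequence information feeds into antisense design, the sketch below derives a DNA oligonucleotide complementary to a chosen window of an mRNA, enforcing the preferred 20 to 40 base length stated above. The function name, the window-selection arguments and the example mRNA are hypothetical; choosing an effective target region in practice involves considerations (accessibility, specificity) that this sketch does not address.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligo (5'->3') complementary to mrna[start:start+length]."""
    if not 20 <= length <= 40:
        raise ValueError("length should be 20 to 40 bases")
    # Work in the DNA alphabet so the oligo can be synthesized as DNA.
    target = mrna.upper().replace("U", "T")[start:start + length]
    if len(target) < length:
        raise ValueError("window extends past the end of the mRNA")
    return target.translate(DNA_COMPLEMENT)[::-1]
```

For the placeholder mRNA `AUGGCCAAGCUUGGAUCCGAAUUC`, `antisense_oligo(mrna, 0, 20)` gives `TCGGATCCAAGCTTGGCCAT`.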

4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of 
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic 
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated 
with a suitable label. 

In general, methods for detecting a polynucleotide of the invention can comprise 
contacting a sample with a compound that binds to and forms a complex with the polynucleotide 
for a period sufficient to form the complex, and detecting the complex, so that if a complex is 
detected, a polynucleotide of the invention is detected in the sample. Such methods can also 
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers 
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed 
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is 
detected in the sample. 

In general, methods for detecting a polypeptide of the invention can comprise contacting 
a sample with a compound that binds to and forms a complex with the polypeptide for a period 
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a 
polypeptide of the invention is detected in the sample. 

In detail, such methods comprise incubating a test sample with one or more of the 
antibodies or one or more of the nucleic acid probes of the present invention and assaying for 
binding of the nucleic acid probes or antibodies to components within the test sample. 

Conditions for incubating a nucleic acid probe or antibody with a test sample vary. 
Incubation conditions depend on the format employed in the assay, the detection methods 
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One 
skilled in the art will recognize that any one of the commonly available hybridization, 
amplification or immunological assay formats can readily be adapted to employ the nucleic acid 
probes or antibodies of the present invention. Examples of such assays can be found in Chard, 
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers, 
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry, 
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice 
and Theory of Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology, 
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the 
present invention include cells, protein or membrane extracts of cells, or biological fluids such as 
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method 
will vary based on the assay format, nature of the detection method and the tissues, cells or 
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane 
extracts of cells are well known in the art and can readily be adapted in order to obtain a 
sample which is compatible with the system utilized. 

In another embodiment of the present invention, kits are provided which contain the 
necessary reagents to carry out the assays of the present invention. Specifically, the invention 
provides a compartment kit to receive, in close confinement, one or more containers which 
comprises: (a) a first container comprising one of the probes or antibodies of the present 
invention; and (b) one or more other containers comprising one or more of the following: wash 
reagents and reagents capable of detecting the presence of a bound probe or antibody. 

In detail, a compartment kit includes any kit in which reagents are contained in separate 
containers. Such containers include small glass containers, plastic containers or strips of plastic 
or paper. Such containers allow one to efficiently transfer reagents from one compartment to 
another compartment such that the samples and reagents are not cross-contaminated, and the 
agents or solutions of each container can be added in a quantitative fashion from one 
compartment to another. Such containers will include a container which will accept the test 
sample, a container which contains the antibodies used in the assay, containers which contain 
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which 
contain the reagents used to detect the bound antibody or probe. Types of detection reagents 
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the 
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of 
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed 
probes and antibodies of the present invention can be readily incorporated into one of the 
established kit formats which are well known in the art. 

4.17 MEDICAL IMAGING 

The novel polypeptides and binding partners of the invention are useful in medical 
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the 
invention is involved in the immune response, for imaging sites of inflammation or infection). 
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of 
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a 
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target 
site. 

4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present invention 
further provides methods of obtaining and identifying agents which bind to a polypeptide 
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID NO:1-1786 
and 3573-5358, or bind to a specific domain of the polypeptide encoded by the nucleic 
acid. In detail, said method comprises the steps of: 

(a) contacting an agent with an isolated protein encoded by an ORF of the present 
invention, or nucleic acid of the invention; and 

(b) determining whether the agent binds to said protein or said nucleic acid. 

In general, therefore, such methods for identifying compounds that bind to a 
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of 
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting 
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds 
to a polynucleotide of the invention is identified. 

Likewise, in general, therefore, such methods for identifying compounds that bind to a 
polypeptide of the invention can comprise contacting a compound with a polypeptide of the 
invention for a time sufficient to form a polypeptide/compound complex, and detecting the 
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a 
polypeptide of the invention is identified. 

Methods for identifying compounds that bind to a polypeptide of the invention can also 
comprise contacting a compound with a polypeptide of the invention in a cell for a time 
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a 
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene 
sequence expression, so that if a polypeptide/compound complex is detected, a compound that 
binds a polypeptide of the invention is identified. 

Compounds identified via such methods can include compounds which modulate the 
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to 
activity observed in the absence of the compound). Alternatively, compounds identified via such 
methods can include compounds which modulate the expression of a polynucleotide of the 
invention (that is, increase or decrease expression relative to expression levels observed in the 
absence of the compound). Compounds, such as compounds identified via the methods of the 
invention, can be tested using standard assays well known to those of skill in the art for their 
ability to modulate activity/expression. 

The agents screened in the above assay can be, but are not limited to, peptides, 
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected 
and screened at random or rationally selected or designed using protein modeling techniques. 

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and 
the like are selected at random and are assayed for their ability to bind to the protein encoded by 
the ORF of the present invention. Alternatively, agents may be rationally selected or designed. 
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen 
based on the configuration of the particular protein. For example, one skilled in the art can 
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the 
like, capable of binding to a specific peptide sequence, in order to generate rationally designed 
antipeptide peptides, for example see Hruby et al., "Application of Synthetic Peptides: Antisense 
Peptides," In Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and 
Kasprzak et al., Biochemistry 28:9230-8 (1989), or pharmaceutical agents, or the like. 

In addition to the foregoing, one class of agents of the present invention, as broadly 
described, can be used to control gene expression through binding to one of the ORFs or EMFs 
of the present invention. As described above, such agents can be randomly screened or 
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design 
sequence specific or element specific agents, modulating the expression of either a single ORF or 
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding 
agents are agents which contain base residues which hybridize or form a triple helix formation 
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester, 
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have 
base attachment capacity. 

Agents suitable for use in these methods preferably contain 20 to 40 bases and are 
designed to be complementary to a region of the gene involved in transcription (triple helix - see 
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et 
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560 
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca 
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription 
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into 
polypeptide. Both techniques have been demonstrated to be effective in model systems. 
Information contained in the sequences of the present invention is necessary for the design of an 
antisense or triple helix oligonucleotide and other DNA binding agents. 

Agents which bind to a protein encoded by one of the ORFs of the present invention can 
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the 
present invention can be formulated using known techniques to generate a pharmaceutical 
composition. 

4.19 USE OF NUCLEIC ACIDS AS PROBES 

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid 
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The 
hybridization probes of the subject invention may be derived from any of the nucleotide 
sequences SEQ ID NO:1-1786 and 3573-5358. Because the corresponding gene is only 
expressed in a limited number of tissues, a hybridization probe derived from any of the 
nucleotide sequences SEQ ID NO:1-1786 and 3573-5358 can be used as an indicator of the 
presence of RNA of a cell type of such a tissue in a sample. 

Any suitable hybridization technique can be employed, such as, for example, in situ 
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides 
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in 
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The 
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a 
degenerate pool of possible sequences for identification of closely related genomic sequences. 
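
A degenerate pool can be enumerated from a single probe specification written with the standard IUPAC nucleotide ambiguity codes. The following is a minimal sketch under that assumption; the function name is hypothetical, and pool size grows multiplicatively with each degenerate position, so long or highly degenerate probes would be better handled lazily.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe):
    """Return every discrete sequence encoded by a degenerate probe."""
    choices = [IUPAC[base] for base in probe.upper()]
    return ["".join(p) for p in product(*choices)]
```

For instance, the probe `ARY` (A, then a purine, then a pyrimidine) expands to the four-member pool `AAC`, `AAT`, `AGC`, `AGT`.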

Other means for producing specific hybridization probes for nucleic acids include the 
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors 
are known in the art and are commercially available and may be used to synthesize RNA probes 
in vitro by means of the addition of the appropriate RNA polymerase, such as T7 or SP6 RNA 
polymerase and the appropriate radioactively labeled nucleotides. The nucleotide sequences may 
be used to construct hybridization probes for mapping their respective genomic sequences. The 
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a 
chromosome using well known genetic and/or chromosomal mapping techniques. These 
techniques include in situ hybridization, linkage analysis against known chromosomal markers, 
hybridization screening with libraries or flow-sorted chromosomal preparations specific to 
known chromosomes, and the like. The technique of fluorescent in situ hybridization of 
chromosome spreads has been described, among other places, in Verma et al. (1988) Human 
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY. 

Fluorescent in situ hybridization of chromosomal preparations and other physical 
chromosome mapping techniques may be correlated with additional genetic map data. Examples 
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation 
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or 
predisposition to a specific disease) may help delimit the region of DNA associated with that 
genetic disease. The nucleotide sequences of the subject invention may be used to detect 
differences in gene sequences between normal, carrier or affected individuals. 

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES 

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for 
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced 
using an automated oligonucleotide synthesizer. 

Support bound oligonucleotides may be prepared by any of the methods known to those of 
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to 
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be 
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72); 
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell 
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all 
references being specifically incorporated herein. 

Another strategy that may be employed is the use of the strong biotin-streptavidin 
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6, 
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on 
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal, Oslo. 
Of course, this same linking chemistry is applicable to coating any surface with streptavidin. 
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies 
(Alameda, CA). 

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be used. Nunc 
Laboratories have developed a method by which DNA can be covalently bound to the microwell 
surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino 
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be 
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the 
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA 
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42). 

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has 
been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is employed 
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using 
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the 
CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently 
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to 
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end 
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and 
then streptavidin used to bind the probes. 

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/ul) and 
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole, 
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. A ss DNA solution is 
then dispensed into CovaLink NH strips (75 ul/well) standing on ice. 

Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in 
10 mM 1-MeIm7, is made fresh and 25 ul added per well. The strips are incubated for 5 hours at 
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are 
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed 
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C). 
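
The dilutions implied by the above protocol can be checked with simple arithmetic. The sketch below assumes ideal mixing with additive volumes; the helper names are hypothetical. It shows that adding 25 ul of 0.2 M EDC to 75 ul of DNA solution gives a final EDC concentration of 0.05 M in a 100 ul well, and that bringing a sample to 10 mM 1-methylimidazole from a 0.1 M stock takes one volume of stock per nine volumes of sample.

```python
def final_concentration(stock_conc, stock_vol, total_vol):
    """Concentration after diluting stock_vol of stock into total_vol."""
    return stock_conc * stock_vol / total_vol


def stock_volume_needed(final_conc, stock_conc, sample_vol):
    """Volume of stock to add to sample_vol to reach final_conc.

    Solves stock_conc * v / (sample_vol + v) = final_conc for v.
    """
    return sample_vol * final_conc / (stock_conc - final_conc)


# 25 ul of 0.2 M EDC added to 75 ul of DNA solution per well.
edc_final_molar = final_concentration(0.2, 25.0, 75.0 + 25.0)

# 0.1 M 1-methylimidazole stock added to 90 ul of DNA solution for 10 mM final.
meim_stock_ul = stock_volume_needed(0.010, 0.1, 90.0)
```

Because the EDC is itself dissolved in 10 mM 1-methylimidazole, the 1-methylimidazole concentration in the well is unchanged by the EDC addition.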

It is contemplated that a further suitable method for use with the present invention is that 
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by 
reference. This method of preparing an oligonucleotide bound to a support involves attaching a 
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic 
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported 
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard 
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include 
nucleoside phosphoramidite and nucleoside hydrogen phosphonate. 

An on-chip strategy for the preparation of DNA probe 
arrays may be employed. For example, addressable laser-activated photodeprotection may be 
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by 
Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by reference. Probes may also 
be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res. 
19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem. 
169(1) 104-8; all references being specifically incorporated herein. 

To link an oligonucleotide to a nylon support, as described by Van Ness et al. (1991), 
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of 
oligonucleotides with cyanuric chloride. 

One particular way to prepare support bound oligonucleotides is to utilize the 
light-generated synthesis described by Pease et al., (1994) PNAS USA 91(11) 5022-6, incorporated 
herein by reference. These authors used current photolithographic techniques to generate arrays of 
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct 
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile 
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile 
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be 
generated in this manner. 

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS 

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic 
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA, 
including mRNA without any amplification steps. For example, Sambrook et al. (1989) describes 
three protocols for the isolation of high molecular weight DNA from mammalian cells (p. 
9.14-9.23). 

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or 
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples 
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be 
prepared in 2-500 ml of final volume. 

The nucleic acids would then be fragmented by any of the methods known to those of skill 
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook et 
al. (1989), shearing by ultrasound and NaOH treatment. 

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic 
Acids Res. 18(24) 7455-6, incorporated herein by reference. In this method, DNA samples are 
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever 
device allows controlled application of low to intermediate pressures to the cell. The results of these 
studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA 
fragmentation methods. 

One particularly suitable way for fragmenting DNA is contemplated to be that using the two 
base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids Res. 
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation 
of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and 
sequencing. 

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy 
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of 
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small 
molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the 
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size 
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lac Z minus 
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and 
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate 
consistent with random fragmentation. 
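The difference between the normal and relaxed (CviJI**) specificities described above can be illustrated with a short Python sketch. It is not taken from Fitzgerald et al.; the function names are hypothetical, only the top strand is modelled, and the site definitions are simply those stated in the preceding paragraph (PuGCPy normally; PuGCPy, PyGCPy and PuGCPu under relaxed conditions; cut between G and C).

```python
import re

# Pu = purine (A or G), Py = pyrimidine (C or T), as in the text above.
# Lookaheads are used so that overlapping sites are all reported.
NORMAL = re.compile(r"(?=[AG]GC[CT])")                # PuGCPy
RELAXED = re.compile(r"(?=[AG]GC[ACGT]|[CT]GC[CT])")  # PuGCPy, PuGCPu, PyGCPy


def cut_positions(seq, relaxed=False):
    """Return indices i where the strand is cut between seq[i-1] and seq[i]."""
    pattern = RELAXED if relaxed else NORMAL
    # Each site is four bases long and is cut between its G and C (2 bases in).
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]


def fragments(seq, relaxed=False):
    """Split seq at every cut position (blunt ends, top strand only)."""
    bounds = [0] + cut_positions(seq, relaxed) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    demo = "TTAGCTAACGCTAAGGCATT"  # hypothetical test sequence
    print(fragments(demo))                # normal specificity
    print(fragments(demo, relaxed=True))  # relaxed specificity
```

In this toy sequence the relaxed specificity cuts at three sites rather than one, which is the sense in which the altered conditions give shorter, more evenly distributed fragments.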

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 ug instead of 2-5
ug); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel
electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is 
important to denature the DNA to give single stranded pieces available for hybridization. This is 
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled 
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the 
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art. 

4.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane.
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a
nylon membrane. By offset printing, a density of dots higher than the density of the wells is
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays)
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same
gene) from different individuals, or may be different, overlapped genomic clones. Each of the
subarrays may represent replica spotting of the same samples. In one example, a selected gene
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane.
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the
dot span may be 1 mm² and there may be a 1 mm space between subarrays.
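The geometry of the 64-patient example can be checked with a small Python sketch. Two points are assumptions rather than statements from the text: the pins are taken to sit at the standard 96-well pitch of 9 mm, and the 64 dots of each subarray are taken to be laid out as an 8 x 8 grid at 1 mm pitch. Under those assumptions each subarray spans 8 mm and leaves the 1 mm gap mentioned above before the next subarray begins. All names are hypothetical.

```python
PIN_ROWS, PIN_COLS = 8, 12   # 96-pin device
PIN_PITCH_MM = 9.0           # assumption: standard 96-well plate spacing
SUBARRAY_SIDE = 8            # assumption: 8 x 8 = 64 dots per subarray
DOT_PITCH_MM = 1.0           # one dot per millimetre within a subarray


def dot_position(patient, pin_row, pin_col):
    """Return (x_mm, y_mm) of the dot made by one pin for one patient plate.

    patient is 0..63; pin_row is 0..7; pin_col is 0..11. Each patient plate is
    printed with a different offset, so every pin builds up its own subarray.
    """
    if not 0 <= patient < SUBARRAY_SIDE * SUBARRAY_SIDE:
        raise ValueError("patient index out of range")
    if not (0 <= pin_row < PIN_ROWS and 0 <= pin_col < PIN_COLS):
        raise ValueError("pin index out of range")
    sub_row, sub_col = divmod(patient, SUBARRAY_SIDE)
    x = pin_col * PIN_PITCH_MM + sub_col * DOT_PITCH_MM
    y = pin_row * PIN_PITCH_MM + sub_row * DOT_PITCH_MM
    return (x, y)


if __name__ == "__main__":
    # The farthest dot shows the printed area fits an 8 x 12 cm membrane.
    print(dot_position(63, 7, 11))
```

The farthest dot lands at about 106 mm by 70 mm, so the 96 subarrays fit on the 8 x 12 cm membrane described in the example.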

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois)
which may be partitioned by physical spacers e.g. a plastic grid molded over the membrane, the grid
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage
screens or x-ray films.

The present invention is illustrated in the following examples. Upon consideration of the
present disclosure, one of skill in the art will appreciate that many other embodiments and variations
may be made in the scope of the present invention. Accordingly, it is intended that the broader
aspects of the present invention not be limited to the disclosure of the following examples. The
present invention is not to be limited in scope by the exemplified embodiments which are intended
as illustrations of single aspects of the invention, and compositions and methods which are
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and
variations in the practice of the invention are expected to occur to those skilled in the art upon
consideration of the present preferred embodiments. Consequently, the only limitations which
should be placed upon the scope of the invention are those which appear in the appended claims.

All references cited within the body of the instant specification are hereby incorporated by 
reference in their entirety. 

5.0 EXAMPLES

5.1.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries 
A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various
human tissues and in some cases isolated from a genomic library derived from human chromosome
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The
inserts of the library were amplified with PCR using primers specific for the vector sequences which
flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and screened
with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were clustered
into groups of similar or identical sequences. Representative clones were selected for sequencing.
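The clustering step can be pictured with the following Python sketch. It is a deliberate simplification and not the SBH analysis actually used: real signatures are hybridization intensities, whereas here a signature is just the set of probes that occur as exact substrings of a clone, and clones are grouped greedily by Jaccard similarity. The function names and the 0.8 threshold are assumptions made for illustration.

```python
def signature(clone_seq, probes):
    """Set of probes found in the clone (a stand-in for positive hybridization)."""
    return frozenset(p for p in probes if p in clone_seq)


def jaccard(a, b):
    """Jaccard similarity of two probe sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_clones(clones, probes, threshold=0.8):
    """Greedily group clones whose signatures resemble a cluster representative.

    clones maps clone name -> sequence. The first clone placed in a cluster
    serves as its representative, mirroring the selection of representative
    clones for sequencing.
    """
    clusters = []  # list of (representative signature, [clone names])
    for name, seq in clones.items():
        sig = signature(seq, probes)
        for rep_sig, members in clusters:
            if jaccard(sig, rep_sig) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((sig, [name]))
    return [members for _, members in clusters]


if __name__ == "__main__":
    probes = ["ACGTACG", "TTTTTTT", "GGGCCCA"]  # hypothetical 7-mer probes
    clones = {"c1": "AAACGTACGAA", "c2": "CCACGTACGCC", "c3": "AATTTTTTTAA"}
    print(cluster_clones(clones, probes))
```

Only one clone per group then needs to be sequenced, which is the practical benefit of the screening step.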

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction.

5.1.2 EXAMPLE 2
Assemblage of Novel Nucleic Acids 

The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 3573-5358,
were assembled using an EST sequence as a seed. Then a recursive algorithm was used to extend
the seed EST into an extended assemblage, by pulling additional sequences from different databases
(i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri 114, and UniGene
version 101) that belong to this assemblage. The algorithm terminated when there were no
additional sequences from the above databases that would extend the assemblage. Inclusion of
component sequences into the assemblage was based on a BLASTN hit to the extending assemblage
with BLAST score greater than 300 and percent identity greater than 95%.
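The extension loop described above might be sketched in Python as follows. This is an illustration under stated assumptions, not the algorithm actually used: the search itself is abstracted behind a hypothetical callable, find_hits, which stands in for a BLASTN search of the current assemblage against the databases, and the recursion is written as an equivalent loop. Only the two inclusion thresholds (score greater than 300, identity greater than 95%) and the stopping rule come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    seq_id: str              # identifier of the database sequence that was hit
    score: float             # BLAST score of the hit
    percent_identity: float  # percent identity of the hit


def extend_assemblage(seed_id, find_hits, min_score=300, min_identity=95.0):
    """Grow an assemblage from a seed EST until no new sequence qualifies.

    find_hits(member_ids) is a hypothetical callable returning an iterable of
    Hit objects for the current set of member sequences.
    """
    members = {seed_id}
    while True:
        new = {
            h.seq_id
            for h in find_hits(frozenset(members))
            if h.score > min_score and h.percent_identity > min_identity
        } - members
        if not new:      # nothing extends the assemblage any further
            return members
        members |= new


if __name__ == "__main__":
    # Toy stand-in for a database search: fixed hits per member sequence.
    links = {
        "seed": [Hit("e1", 350, 98.0), Hit("e2", 250, 99.0)],
        "e1": [Hit("e3", 400, 96.0), Hit("e4", 500, 90.0)],
    }
    toy = lambda ids: [h for i in ids for h in links.get(i, [])]
    print(sorted(extend_assemblage("seed", toy)))
```

In the toy data, e2 is rejected on score and e4 on identity, so the assemblage stops at the seed plus e1 and e3.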

A polypeptide was predicted to be encoded by each of SEQ ID NO: 3573-5358 as set forth
below. The polypeptides were predicted using a software program called FASTY (available from
http://fasta.bioch.virginia.edu), which selects a polypeptide based on a comparison of translated
novel polynucleotide to known polynucleotides (W.R. Pearson, Methods in Enzymology, 183:63-98
(1990), herein incorporated by reference). The predicted polypeptides are shown in Table 7.

5.2.2 EXAMPLE 3 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank. Other computer programs which may
have been used in the editing process were phredPhrap and Consed (University of Washington) and
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1-327.

Table 1 shows the various tissue sources of SEQ ID NO: 1-327.

The nearest neighbor results for SEQ ID NO: 1-327 were obtained by a FASTA version 3
search against Genpept release 117, using the FASTXY algorithm. FASTXY is an improved
version of FASTA alignment which allows in-codon frame shifts. The nearest neighbor result
showed the closest homologue for SEQ ID NO: 1-327 from Genpept. The translated amino acid
sequences for which the nucleic acid sequence encodes are shown in the Sequence Listing. The
nearest neighbor results for SEQ ID NO: 1-327 are shown in Table 2 below.
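Why frame shifts matter for translated searches can be seen from a plain three-frame translation, sketched below in Python. This is not the FASTX/FASTY implementation; it only shows, using the standard genetic code, that the same nucleotide sequence gives unrelated peptides in the three forward reading frames, so a single inserted or deleted base moves the remainder of a coding region into a different frame. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate one forward frame (0, 1 or 2); '*' marks a stop codon.

    A trailing partial codon is ignored and unknown codons become 'X'.
    """
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)


def three_frames(dna):
    """Return the translations of the three forward reading frames."""
    return [translate(dna, f) for f in range(3)]


if __name__ == "__main__":
    print(three_frames("ATGGCCTAA"))
```

An aligner that tolerates frame shifts can follow a protein match from one of these frames into another, which is the property the paragraph above attributes to FASTXY.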

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.
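The two summary values reported in Table 5 can be illustrated with the Python sketch below. It does not reproduce SignalP: the per-residue S scores are taken as given, the maximum is taken over the positions supplied, and the mean is assumed to be averaged over the residues that precede the predicted cleavage site. The function name and the example scores are hypothetical.

```python
def s_score_summary(s_scores, cleavage_index):
    """Return (max S, mean S) for one polypeptide.

    s_scores holds one signal-peptide score per residue, N-terminus first.
    cleavage_index is the number of residues in the predicted signal peptide,
    i.e. cleavage occurs between residues cleavage_index and cleavage_index+1.
    """
    if not 0 < cleavage_index <= len(s_scores):
        raise ValueError("cleavage_index must lie within the scored region")
    max_s = max(s_scores)
    # Assumption: the mean is taken over the signal peptide residues only.
    mean_s = sum(s_scores[:cleavage_index]) / cleavage_index
    return max_s, mean_s


if __name__ == "__main__":
    scores = [1.0, 0.5, 0.75, 0.25, 0.0]  # made-up per-residue S scores
    print(s_score_summary(scores, 3))
```

High values of both numbers together indicate a region that scores as signal peptide throughout, rather than one isolated high-scoring residue.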

5.3.2 EXAMPLE 4 

Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants resulting from
these procedures, are shown in the Sequence Listing as SEQ ID NOS: 328-1413.
Table 1 shows the various tissue sources of SEQ ID NO: 328-1413.

The nearest neighbor results for SEQ ID NO: 328-1413 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 118, using the BLAST algorithm. The
nearest neighbor result showed the closest homologue for SEQ ID NO: 328-1413 from Genpept.
The translated amino acid sequences for which the nucleic acid sequence encodes are shown in
the Sequence Listing. The nearest neighbor results for SEQ ID NO: 328-1413 are shown in
Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.



5.3.2 EXAMPLE 5

Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was 
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1414-1652.
Table 1 shows the various tissue sources of SEQ ID NO: 1414-1652.
The nearest neighbor results for SEQ ID NO: 1414-1652 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 118, using the BLAST algorithm. The
nearest neighbor result showed the closest homologue for SEQ ID NO: 1414-1652 from
Genpept. The translated amino acid sequences for which the nucleic acid sequence encodes are
shown in the Sequence Listing. The nearest neighbor results for SEQ ID NO: 1414-1652 are
shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.4.2 EXAMPLE 6
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1653-1745.
Table 1 shows the various tissue sources of SEQ ID NO: 1653-1745.
The homology for SEQ ID NO: 1653-1745 was obtained by a BLASTP version 2.0al
19MP-WashU search against Genpept release 118, using the BLAST algorithm. The results showed
homologues for SEQ ID NO: 1653-1745 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologues
with identifiable functions for SEQ ID NO: 1653-1745 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.5.2 EXAMPLE 7 
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 119, gb pri 119,
UniGene version 119, Genpept release 119). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants resulting from
these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1746-1768.
Table 1 shows the various tissue sources of SEQ ID NO: 1746-1768.
The homology for SEQ ID NO: 1746-1768 was obtained by a BLASTP version 2.0al
19MP-WashU search against Genpept release 119, using the BLAST algorithm. The results showed
homologues for SEQ ID NO: 1746-1768 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologues
with identifiable functions for SEQ ID NO: 1746-1768 are shown in Table 2 below.
Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were examined
to determine whether they had identifiable signature regions. Table 3 shows the signature region
found in the indicated polypeptide sequences, the description of the signature, the eMatrix
p-value(s) and the position(s) of the signature within the polypeptide sequence.
Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process for
identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also disclosed by
Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the publication
"Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites"
Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum
S score and a mean S score, as described in the Nielsen et al. reference, were obtained for the
polypeptide sequences. Table 5 shows the position of the signal peptide in each of the polypeptides
and the maximum score and mean score associated with that signal peptide.

5.6.2 EXAMPLE 8

Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA 
sequence and its corresponding protein sequence were generated from the assemblage. Any frame 
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 120, gb pri 120,
UniGene version 120, Genpept release 120). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The translated amino acid sequences for which the nucleic acid
sequence encodes are shown in the Sequence Listing. The full-length nucleotide sequences, including splice
variants resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS:
1769-1786.

Table 1 shows the various tissue sources of SEQ ID NO: 1769-1786. 

The homology for SEQ ID NO: 1769-1786 was obtained by a BLASTP version 2.0al
19MP-WashU search against Genpept release 120 and the amino acid version of Geneseq
released on October 26, 2000, using the BLAST algorithm. The results showed homologues for
SEQ ID NO: 1769-1786 from Genpept. The homologues with identifiable functions for SEQ ID
NO: 1769-1786 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

Table 6 is a correlation table of all of the sequences and the SEQ ID NOS.

TABLE 1



Tissue Origin 



adult brain



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



GIBCO 



AB3001 



adult brain



ABDOOr 



9 19-21 50-51 65-66 72 78 80 82 
85 87 107-108 113 116 123 138 
140 150-152 159 169 177 192-193 
202-203 212-214 225-226 235-236 
251 258 268-269 272 280-281 295 
298 301 321 326 331-332 334 356- 
357 362 369 379 382-383 416 423 
443 459-460 473 475 477 488 496 
500 503 519 526 547 574 582 587
608-609 613 618 633-634 645-646 
652 657-658 660 669-671 678 687 
695 697 710 715 724 731 775-777 
796 804 811 857-859 862 869 899- 
900 912 919 922 924-929 933 936 
962 979 968-989 996 1001 1004- 
1008 1018 1039 1047 1059 1064 
1067 1070 1078 1082 1107 1113 
1116-1117 1131 1134-1137 1140 
1149 1151 1157 1180 1206 1229 
1234 1241 1243 1258 1272-1273 
1279 1288-1290 1294 1307-1308
1312 1320 1323 1330 1356 1360- 
1361 1368 1373-1375 1379 1391 
1400 1417 1446 1468 1482 1493- 
1494 1501-1503 1506-1507 1512 
1517 1522-1524 1530-1533 1537 
1549 1565 1578 1598 1606 1608 
1623 1625 1627 1639 1643 1648- 
1649 1653 1664 1667 1671 1696 
1734 1741 1743-1744 1760-1761 
1771 



3 12-14 18-19 25 30-31 34-36 43-
45 50-51 56 58 60 65-66 68-69 80
82 85 87 92 104 107-108 112-113
115-116 123-124 131-132 135-137 
139 142 146 148-149 152 154 157 
159 163 165 167 169 172 180 192- 
193 196-197 199 203 208 210 212- 
214 223 233 235-237 247 257 259 
261 268-269 272 276 280-281 284- 
288 291-292 295 297 300-301 304 
307 317 320-321 323 327 329-331 
333-334 345-349 356-357 379-381 
393 401 408 414 419 424 426-428
430 433-436 438-439 443 445 449 
453-454 459-461 468 471-473 476- 
478 483 491 494 496 500 503 507- 
508 516 519-520 525-527 534 536- 
540 542-543 545 553 555 560 569- 
570 574-576 586-588 593 595 597
601 606-609 616-620 622-623 625 
628-633 635-636 643 645-649 653 
655-656 660-665 668-670 676 681 
687 701 710 715 717 724-728 735 
743 745-746 753 753 759 765-766 
773 775-778 786 789 796 799-800 
802-803 810-811 815 817 820-821 
832 834-836 840 845-847 851 858-
861 864 869 874 878 883 897 901- 
902 904-905 908 911-914 916 921- 
922 924-927 929 932-934 936-939 
941-942 945 955-958 963 966-969 
977 979-980 985-986 990 992-993 
997-1001 1005-1007 1012 1017- 
1020 1023-1024 1029-1031 1034 
1036 1039 1050 1059 1063-1066 
1078 1081-1082 1085-1086 1089

adult brain 



Clontech 



ABR001 



adult brain 



Clontech 



ABR006 



1097 1103 1107 1109 1112 1116- 
1117 1119 1121 1124 1127 1130 
1134 1144-1145 1149 1151 1157- 
1158 1167 1170 1178 1184 1188 
1190 1193-1194 1200 1202 1215- 
1217 1220 1226-1227 1229 1231 
1241 1243 1247 1252 1258 1263 
1267 1269 1279 1281 1284 1286- 
1289 1293-1294 1306-1307 1312 
1316-1320 1326 1333 1338 1341 
1344 1348 1351 1355-1357 1368 
1374 1377 1380 1386 1389-1390 
1394 1400 1409 1414 1422-1423 
1425-1427 1437 1443 1446 1454 
1456 1458-1459 1468 1470-1472
1478 1482-1483 1487-1488 1493 
1497 1499 1506 1508-1511 1517 
1522-1524 1530-1533 1545-1546 
1548-1550 1552 1557-1559-1563 
1565 1567 1569 1571 1586 1588
1591 1593 1595 1598-1601 1608 
1611 1620-1621 1624-1626 1628 
1630-1632 1636 1640-1641 1644- 
1645 1647 1649 1653-1655 1657 
1664 1667 1669 1673 1678-1681 
1686 1690 1694-1696 1701 1709 
1711 1719 1722-1723 1726-1727 
1731-1733 1738 1740 1743-1744 
1747 1749 1753 1757-1758 1760- 
1761 1765 1771 1785 



9 29 68-69 113 115 146 152 206 
223 245 277 307 320 324 330-331
344 348 352 362 379 384 393 404 
408 414 441-442 454 469 481 490 
506 517 586 597 631 641 659 691 
715 799 803 833 865 871 875 880
882 908 920 937 1000 1005-1006 
1027 1036 1041 1043 1075 1107 
1112 1121 1127 1136-1137 1144- 
1147 1231 1238-1239 1280 1293 
1320 1345 1355 1361 1383-1384 
1400 1417 1448 1456 1476 1507
1570 1572 1609-1610 1614 1620 
1626 1645 1653 1754 1759 1770 
1786 



adult brain 



5-8 15-16 168 212-213 271 278

280-281 291-292 300-301 310 314 
321 326 336-338 341 352 357 359- 
360 362 369 374 379 384 393 396- 
397 414 419-420 426-428 430 441- 
442 453 506 616-617 661 689 785 
798 845 1018 1109 1113 1124 1148 
1167 1187 1207 1227 1262 1265 
1285 1312 1317-1319 1324-1327 
1344 1369 1381 1400 1416 1421 
1427 1430-1431 1436 1471 1501 
1557-1559 1586 1588 1651 1653 
1664-1655 1671 1673 1690 1697- 
1698 1700 1711 1717 1719-1720 
1728 1736 1740 1743-1744 1757 
1760-1761 



Clontech 



ABR008 



10 13-19 22-23 25 29 33 37-39
43-45 50-51 54-55 57-58 60-66
68-70 72 75 77-80 83 85 89-92 94 
99-105 108-110 112-113 116-117 
123 128 133 135-137 139 143 145- 
146 148 152 154-155 157 166 168-
172 174-175 181-184 188-190 193- 
194 196 198-200 202 204-205 207-
208 210 214-215 218 221-226 229
231-232 234-241 246-247 251-253
255 257-259 268-269 271 276-281 
285-286 288 290-292 300-302 304 
307 309-311 313 315 317-318 320- 
322 325-326 328 330-331 333-338 
341 344-347 349 352 354 356-357 
362 369-373 376 379-380 382 384 
387 390-391 393-394 397 399-403 
405-411 414-415 417-420 426-428 
437-438 440-444 453-455 462 464
467 469-471 476 478 482-484 488-
491 497 503 506-513 516-517 520
524-526 528-530 532-534 537-540
542 544 547-551 553 561 565-567
572-574 577 581 585 587-588 590- 
591 597 599 601-602 606-610 612 
615-617 619-620 622-623 628-629 
631 633-634 636-641 643 645-647 
651-653 655-664 669-671 673 679 
682 687 689 691-700 702 706 710 
715-717 720-721 725-734 736-739 
742-743 746 750-752 756 758-759 
762-764 766 768 773-778 780-782
784-785 787-789 794 796 799 802-
803 805 811 814-815 818 825-826
834-837 839-840 842-843 856-859
861-862 865 867-872 874-875 881
883-884 887 889-892 894-895 897-
898 901 904 908 910 912 914 917
919 921-924 926-927 930-932 935-
941 943 945 949 953-954 958 961-
963 967 969 971 975 977 981-983 
986 988-990 992 997 999-1002 
1004-1006 1008 1012 1018-1023 
1027 1029-1031 1035-1037 1047- 
1048 1053 1057 1059 1063 1068 
1070 1072-1075 1077 1081-1083 
1085-1093 1095-1096 1108-1112 
1114-1125 1127 1131-1133 1135- 
1138 1142-1145 1148-1158 1160- 
1163 1167 1169 1172 1175 1177 
1180 1183-1188 1191-1195 1199-
1200 1204 1206 1211 1213-1216 
1222-1223 1226-1227 1229-1231 
1234-1235 1241-1242 1244-1263 
1266 1269-1271 1276-1277 1279- 
1281 1284-1286 1292 1294-1295 
1299 1305-1309 1312 1314 1316-
1319 1322 1324-1327 1330 1332 
1334-1335 1339 1344-1346 1351 
1354-1355 1357-1358 1365-1367 
1369-1370 1373-1374 1376-1379 
1381-1384 1386-1388 1392 1394 
1396-1397 1400 1403-1407 1410 
1414 1419-1420 1423 1432-1433 
1435 1437-1438 1440-1442 1446 
1448 1453-1455 1457 1461 1463-
1464 1466 1468 1471 1477 1480
1482-1483 1496 1502-1504 1507-
1509 1513 1519-1520 1524-1526 
1536 1547 1549-1552 1567 1573- 
1574 1578 1586-1589 1597-1598
1601-1602 1605 1607-1609 1611- 
1617 1619-1621 1623 1625-1626 
1635-1641 1643-1645 1649 1651 
1653 1656-1658 1664 1669 1671- 
1674 1676-1684 1686 1689-1690
1694-1696 1704-1705 1708-1709 



adult brain 



Clontech 



1720-1724 1726-1728 1730-1733
1737-1740 1742-1745 1753 1756- 
1757 1759-1761 1765 1767 1771- 
1772 1776-1777 1779-1780 1786



ABR011 



adult brain 



BioChain



24 75 103 186 210 310-311 364-
365 508 623 710 937 1002-1003 
1059 1204 1609 1731-1732 



adult brain 



Invitrogen 



ABR012 
ABR013 



46 182-184 204-205 300 739 757 
1371 1549 1620 1684 



adult brain 



IBS 204-205 364-365 393 497 595 
687 692-694 830 845 1068 1320 
1413 1640 



Invitrogen 



ABR014 



187 301 357 364-365 375 454 463
731 859 939 983 1073 1262 1270 
1320 1403 1640 1651 1657 1696 
1722 1738 



adult brain 



Invitrogen 



ABR015



adult brain 
adult brain 



419 434-435 441-442 763 789 983 
1320 



Invitrogen 



ABR016 
ABT004 



312 364-365 379 1320 1334-1335 
1674 1722 1785 



Invitrogen



cultured
preadipocytes



Stratagene



ADP001 



14-16 22-23 25 37-39 43 SB £6 
70-72 78 86 94 107 113 116 136- 
137 143 146 152 161 173 182-184 
194 196 198 210 218 229 259 267
295 298 309-310 320-321 324 336- 
338 346-347 349-350 356-357 362 
371 379-380 382-383 391 393 396 
399 401 408 428 438 459 461 476 
482 490 502 507-509 516 526 531 
557 562 597 602 607-609 624 652 
655 667 669 671-672 687-689 695- 
696 710 712 715 721 732 739 743 
750 753 766 778 780-781 789 803 
814 826 830 837 841 857 869 874 
894-895 925 937 949 954-956 960- 
961 963 968-969 988-989 1000 
1005-1006 1016-1019 1021 1036- 
1037 1052 1086 1090 1109 1113 
1115 1120-1121 1123-1124 1136- 
1137 1140 1144-1147 1151 1167 
H7jp 1174 1188 1193-1194 1205 *- 
1225 1229 1231 1254 1258 1262
1280 1285 1309 1312 1334-1335 
1341 1343-1344 1356-1357 1370 
1378-1379 1383-1384 1403-1404 
1423 1429 1434 1442 1448 1451- 
1452 1454 1470-1472 1482 1499 
1525 1528-1529 1532 1536 1547
1554 1557-1559 1561-1562 1567
1585 1588 1590 1595 1601-1604
1608 1610-1613 1615 1619 1624 
1627 1640 1644 1647 1660 1664 
1666 1670 1675 1696 1704 1715 
1723 1727 1738 1760-1761 1768 
1779 1785-1786 



5-8 11 17 25 68-69 80 82 87 103 
105 110 116 136-138 168 171 188- 
189 196-198 261 267 276 288 293 
301 318 331 336-338 379-380 391 
400 428 430-431 510-512 520 524 
527 549 557 561 602 618 620 622 
631 637 647 670 681-682 710 731
748 782 793-794 817 834-836 843 
845 858-859 879 882 893-895 934 
960 982 986 995-996 1000 1002
1005-1007 1025 1027-1028 1032 
1039 1045 1071 1078 1097 1099-
1102 1136-1137 1140 1219-1220 



adrenal gland 



Clontech



1260 

1322 

1370- 

1437 

1602 

1660 

1711 

1760- 



1271 
1329 
1371 
1466 
1606 
1662 
1719 
1761 



1297- 

1339 

1398 

1468 

1614 

1673 

1720 

1765 



1298 

1345 

1408 

1533 

1631 

1687- 

1742 

1767 



133T 
1365- 
1423 
1539 
1649- 
1688 
1746 
1771 



1320 
1366 
1431 
1594 
1650 
1696 
1749 
1785 



ADR002 



adult heart 



4-10 15-16 25 29-31
51 55 60 62-63 65-6 
116 118 122 126 130
170 181 192 198 201 
228 247 251 255 267 
281 285 295 298 311 
349 351-352 354 372 
391 400 410 415-416 
431 434-437 439 445 
477 483 491 493 497 
519 527 535 546 549
581 588 595 600 602 
628-630 637 645-646 
713 715 719 732 734 
773-778 789 816 829 
869 875 883 898 904
930-931 942 948 952 
976-977 981 990 992 
1004 1049 1055 1059 
1076 1112-1113 1115 
1134-1135 1151 1158 
1181 1188 1209 1218
1227 1231 1243 1270 
1280 1285 1290 1293 
1325 1327 1330 1342- 
1348 1365-1366 1369 
1387 1398 1400 1405 
1426 1436 1440-1441 
1463-1464 1488 1491 
1538 1546 1567 1573 
1598 1609 1614 1618 
1627 1634 1636 1649 
1671 1674 1678-1679 
1703 1717 1727 1731 
1765 



43-4* 47 50- 
6 75 80 102

137 150 169- 
-203 215 227- 
-269 271 280- 

336-338 342 
-373 383-385 
424 426-427 
454 461 473 
-498 503 516 
552 572-573 
608-610 620 
670 679 703 
744-746 758 
837 845 848
912 922-923 
965 967 969 
993 1001 
1071-1072 
1121 1127 
1163 1175 
1224-1225 
1271 1274 
1307 1324- 
1343 1345 
1378-1379 
1417 1425- 
1444 1454 
1507 1512 
-1575 1588 
1622 1624 
1651 1658 
1691-1692
1732 1737 



GIBCO 



AHR001 



4-8 10-11 1 
46 50-52 57 
85 87 89 94
110 112 114 
127 130-132 
147-151 153 
186 192 195 
215 220 225 
236 251 257 
277 280-282 
298-301 304 
325 330 333 
352 354 358 
384 387-398 
408-409 411 
433-439 445 
457 459 462 
483-484 487 
503 506 508 
526 534 536 
560-562 574 
587 589 593 
612 615-620 
645-652 656 
674-675 683 
701 709 712 



5-16 18-21 34-39 
58 60 62-63 71 
97 100 103-104 
116 118-119 122 
134 136-138 141 
163-164 168-171 
197 199 204-205 
226 229-230 232 
260 262 265 272 
285-286 289-292 
307 309 314 321 
336-338 345 349 
361 368 370 380 
391 393 397 401 
-412 414-416 430- 
-446 449 452 

469 472-473 476 
-490 492-493 496 
510-513 516 519 
-540 542 546 
577 581-582 
595 597 604-609 
622-623 626 632 
660 665-666 670 
684 687 692-694 
715-716 719-720 



549 
584 



44- 
75 82 
108- 
123 
144 
179 
212- 
234- 
274 
296 
324- 
351- 
383- 
406 
431 
-455 
-480 
498 
522 
553 
586- 
611- 
637
672 
697 
725- 



726 


728 730-732 735 


738- 


739 743-








744 


746 751 753 759 


761 


765 770-








771 


775-780 785 788


-790 


796 802 








804 


810 812 817 821


826 


828 830 








837


843 845-847 849


-853 


857-861 








863- 


864 869 871 875 


877- 


879 881 








883 


887 890-892 894 


-895 


897-898 








901 


903 906-907 911 


-913 


915 919 








921- 


925 927-928 933


-935 


945 958 








961- 


963 967 969-972 


975 


977-978 








900- 


986 990 992 999 


-1002 


1005- 








1007 


1010 1016 


1019 


-1020 


1022-








1023 


1025 1028-1037 


1039 


-1040 








1043 


1047 1050 


1054 


-1055 


1057 








1059 


1063-1064 


1067 


-1068 


1070 








1072 


1075-1076 


1083 


1085 


-1087 








1089 


1093-1094 


1104 


1106 


1108- 








1109 


1113 1116 


-1117 


1119 


1121 








1124 


1126 1128 


1131 


-1134 


1144- 








1145 


1148-1149 


1151 


1158 


1167 








1169 


-1170 1175 


1177 


1192 


1196 








1199 


-1200 1202 


1206 


-1208


1211 








1216 


1218 1222 


1227 


-1229 


1232- 








1235 


1238-1241 


1243 


-1244 


1247- 








1248 


1250 1253-1254 


1256 


-1258 








1261 


1268 1270-1271 


1277 


1280- 








1282 


1287 1292 


1298 


-1299 


1306 








1308 


1317-1321 


1324-1325 


1330 








1332 


1334-1337 


1339 


1344 


-1345 








1349-1350 1354- 


-1356


1359 


-1360 








1365-1366 1369 


1371 


1374 


-1375 








1378-1380 1383- 


1384 


1389 


1397 








1400 


1403 1409 


1417 


1423 


-1426 








1437 


1439 1442 


1444 


1446 


-1447 








1450 


1453 1468 


1470 


1473 


1479 








1481 


1488 1490 


1501- 


-1504 


1519 








1521 


1524 1528 


1530-


-1534


1536- 








1537 


1539 1541- 


1542 


1547 


1553 








1555 


1560 1565


1567-1571 


1588 








1591 


1597-1598 


1601-1602 


1605 








1614-1616 1619- 


1620 


1623-1628 








1630-1632 1634


1636 


1641 


1644- 








1645 


1647 1649 


1652- 


1655 


1659 








1662 


1667 1673- 


1674 


1680- 


-1681 








1684 


1686-1688 


1704- 


1705 


1709 








1711- 


1712 1717 


1724 


1726- 


-1727








1731- 


1733 1737- 


1738 


1741 


1743- 








1744 


1749 1754- 


1755 


1760-1761 








1765 


1772 1785 








adult kidney 


GIBCO


AKD001


4-8 2 


0-11 17-21 


29-31 35- 


-39 42- 








45 50-51 56-58 


60-61 


64 68-69 75 








77 80 


82 85 87


92-94 


97 100 102- 








104 107-108 112 


116- 


117 119 123 








127-133 136-137 


139- 


141 143-144








147-154 157 161 


-163 


165-166 169 








172 176 178-179 


192 


194-197 199 








201 203-206 209 


-210 


212-213 215- 








216 223-228 234 


-236 


238 247 251- 








253 257-259 261


-262 


265-269 271- 








272 274 276-277 


279- 


281 284-286 








293 293 296 298 


-299 


301-302 304 








307 311-313 321 


325- 


326 329-331 








333 341 344 348 


-350 


352 356 358-








359 362 364-365 


368


370-372 374 








376-377 380-382 


392 


395 398 400- 








401 404 407-409


414- 


415 423-424 








430-437 443-444 


446 


449 451 453- 








455 459 461-462


464 


467 469 471- 








474 476-477 480-481 


483 487-488



490-491 493 497-505
520 522 524 526-529 
544 547 549 554-556 
557 571-576 578 582 
593 598-599 601 604 
615-619 621-626 632 
645-652 655 660-664 
678-679 688 692-695
713 717 719-720 727 
738 743 745-746 751 
763 765 771-773 775 
788 793 795-796 800 
810-812 814-819 821 
834-838 842-845 848
864-865 867 869 871 
886-887 889-891 893
902 906-908 910-914 
925-927 929-935 937 
948-949 951 953-958 
964 969-970 972 976 
988-990 992-993 995 
1004-1008 1010 1012 
1017 1019-1020 1022 
1035 1038-1040 1042 
1050 1054-1055 1057 
1070-1073 1078 1085
1089 1092 1094 1097 
1107 1109-1112 1116 
1123-1125 1132-1135 
1143 1146-1147 1149 
1154 1157 1159 1163 
1178-1179 1181 1183 
1200 1202-1204 1206 
1219 1221-1222 1225 
1232-1234 1238-1241 
1246-1247 1253 1257 
1261 1267-1268 1270 
1281 1283 1287-1289 
1299 1306 1308 1311 
1320 1323 1329-1330 
1339 1341 1349-1350 
1359 1367 1369 1373 
1379 1394 1397 1400 
1407-1409 1417 1419 
1428-1431 1433 1437 
1443 1445-1446 1448 
1454 1459 1461 1465 
1475 1478 1484-1488
1493 1495 1497-1498 
1509 1512 1518 1521- 
1527-1528 1532-1533 
1541 1547-1550 1552
1561 1565-1566 1568
1578-1579 1583 1586- 
1591-1592 1594 1598 
1604 1606 1608 1611 
1616 1618-1622 1624- 
1632 1634-1636 1638- 
1644 1646-1649 1653-
1664 1666-1667 1670- 
1679 1683-1684 1686
1696-1699 1701 1709- 
1714 1716-1719 1723- 
1727 1733 1737-1738 
1744 1748-1749 1751 
1763-1768 1778 1780 



510-513 516- 
534 537-540 
560 562 564 
586-589 592- 
606 608-613
634 637-643 
669-672 676 
698 702 711 
731 735-736 
753 755 762- 
-776 780 786 
803 805 808 
826 829 832 
-855 857-861 
874 876-883
-896 898-900 
918 920 922 
940-942 945 
960-961 963- 
-978 982-986
-997 999-1002
-1013 1016-
1025-1031
1044 1047
-1064 1068
-1086 1088-
1099-1102
-1119 1121
1140 1142-
-1150 1153-
1167 1170
1192 1196-
-1211 1216-
1227-1230 
1243-1244 
-1258 1260-
1272-1274
1293-1295
-1313 1317-
1334-1335 
1353-1357 
1375 1378- 
1403 1405 
1423-1424 
1438 1442- 
1450 1453- 
1468 1474- 
1490 1492- 
1506-1507 
1522 1525 
1537 1540- 
1556-1559 
1571 1575 
1587 1589 
1600 1603- 
1613 1615- 
1628 1631- 
1639 1641 
1656 1662 
1671 1676- 
1691-1692 
1711 1713- 
1724 1726- 
1741 1743- 
1760-1761 
1785 



adult kidney 



Invitrogen 



AKT002 20-21 37-39 47 52 57 60 65-66

68-69 80 104 107-108 122 130 133 
136-137 140 142-143 149 169 174 



adult lung 



181 197 227-228 235-236 244 251
261-265 267 280-281 286 290 299 
301 304-305 309 312-313 339 341 
344-345 349 358 370-372 376 382- 
383 387 392 401 414 416 421 430
443 445 449 453-454 472 487-488
504 506 513 516 519 522 520 536-
540 546 554 585 587 594 598 602
607 616-617 626-627 636 643 662- 
664 695 709 721 735 743 761 768 
775-777 788 796 804 814 827 837- 
838 849-850 852-853 869-870 881
890-892 898 903 905-907 914 919 
925 927 934 941 949 952 957 960 
962 968 970 1000 1008 1029-1030 
1044 1052 1055 1063 1067-1068 
1073 1085 1099-1102 1107 1110- 
1111 1113 1115 1119 1126 1134 
1136-1137 1146-1148 1153 1159 
1192 1196 1199 1232-1233 1241 
1256 1264 1272-1273 1281 1285 
1293-1294 1299 1312 1320 1324- 
1325 1330 1344 1349 1351 1355- 
1356 1369 1378-1379 1403 
1419 1428-1429 1436 1446 
1463-1464 1467-1468 1470 1477- 
1478 1486 1491 1509 1519 1527 
1529 1534 1547 1596 1600 1619 
1623 1629 1631 1634 1638 1643 
1647 1652 1660 1664 1667 1669- 
1670 1673 1686 1709 1727 1740 
1776 



1414 
1458 



GIBCO 



ALG001 



lymph node 



4-8 14 37-39 44-46 
63 75 82 88 93 103- 
133 140 143 150 152 
171-172 174-175 190 
211 214 219 223-224 
252 256 265 272 274 
310 332 345 351 362 
394 408-409 431 436 
461 467 469 471 476 
513 527 537-540 544 
564 583 607 616-617 
634 645-646 662-664 
719 743-744 763 766 
811 814 817 831-832
852-853 858-859 861 
901 905 941 954-957 
979 981 987 990 992 
1005-1006 1014 1017 
1054 1059 1062 1064 
1086-1089 1094 1107 
1136-1137 1142 1150 
1190 1200 1208 1220 
1273 1280 1282 1295 
1331-1332 1353 1374 
1384 1404 1409 1423 
1442 1474 1478 1494 
1525 1531-1532 1547 
1554 1571 1598 1606
1627-1629 1632 1642
1669 1676-1677 1684
1731-1732 1737-1738
1786 



50-51 56 62-
104 113 125 

154 157 162 
-191 196 200 
227-228 251- 
280-281 285 
371 381-382 
445 454 459 
-477 489 504
547-548 554 
621 623-624 
670 695 716 
774 789 803 
837-838 845 
866 880 887 
966 971 977 
996 1001 
1045 1047 
1072 1080 
1126 1134 
1157 1173 
1241 1272- 
1306 1320 
1379 1383- 
1434 1436 
1509 1522
1549 1553- 
1613 1624 
1644 1662 
1696 1727 
1748-1749 



Clontech 



ALN001 



24 50-51 82 105 137 153 198 
201 223-224 234 268-269 272 280- 
281 287 301 312 329 343 382 421
430 433 445 451 461-462 475 481-
482 503 526 529 537-540 546-547 



young liver 



521 626 649 679 719 
793 803 831 834-836 
858 866 879 905 913 
1005-1006 1012 1038 
1117 1151 1199 1204 
1265 1274 1324-1325 
1374 1377 1440-1441 
1549 1600 1618-1619 
1644 1653 1687-1688 
1741 1771 



725-726 738
838 844 857- 
928 963 976
1050 1116- 
1226 1243 
1339 1353 
1447 1504 
1631 1641 
1691-1692 



GIBCO 



ALV001 



adult liver



Invitrogen



ALV002 



5-8 11 20-21 46 50-51 58 65-66
75 79 82 93 97 102-103 108 110 
116 139 143-144 148-149 171-172 
174 187-189 194-195 198 209 214- 
215 230 250 258 267-269 280-281 
306 309 342 351 356 359 362 372 
374 392 394 398 401 407-408 410 
414 431 444 455 459 476 470 483 
493 510-512 516 520 522 526 536 
549 571 574-577 585 592 601-602 
607 621-624 628-630 632-633 637 
648 660 666-667 678 697-698 700 
717 719 728 730 734 738 744-745 
766 770 773 779 788 800 808 812 
814 841 849-851 871 874 879 887 
893 898-900 902-904 906-907 911 
919 922 924 934 953 957 963 965 
970 984 986 997 1001 1004 1007 
1012 1029-1030 1033-1034 1052 
1061 1066 1070 1076 1086 1089 
1093 1099-1102 1110-1112 1116- 
1117 1119 1121 1125 1136-1137 
1144-1145 1156-1157 1159 1196
1199-1200 1209 1211 1219-1220 
1241 1244 1262 1270 1275 1279 
1283 1295 1317-1320 1332 1339 
1344 1359 1362-1363 1379 1383- 
1384 1403 1415 1430-1431 1437 
1450 1467 1475-1476 1483-1484 
1494-1495 1498 1505 1512 1516 
1518-1519 1526 1529 1547 1550-
1552 1557-1559 1565 1583 1587
1597 1609 1614 1620 1631 1637 
1641 1644 1654-1655 1662 1667
1669 1684 1691-1692 1702 1711 
1725 1738 1741 1743-1744 1758 
1760-1761 1763-1765 1769 



5-8 17 20-21 32-33 41 55 58 64 
75 77 86 89 102 108 117 119 175- 
176 198 200 209 231 235-236 250 
272 275-276 284 306 316 321 325 
333 356 359 374 376 398 401 408 
414 428 430 433-435 454 476 494 
503-505 517-518 528 534 544 552 
561-563 567 578 581 608-609 630
632 637 644 650 661 665 672 702 
707 710 721-722 750 753 778 782 
794 814 820 826 834-837 847 849- 
850 858 861 874 879 893 898 904
911 918 921-922 926 946 948 972 
978 986 996 1020 1027 1031 1034 
1053 1063 1068 1070 1073 1086 
1089 1093 1097 1113 1119 1156 
1159 11S5 1198-1199 1208 1220
1227 1241 1261 1272-1273 1277 
1285 1308 1315 1320 1324-1325 
1330 1362-1363 1375 1403 1408- 
1409 1415 1431-1432 1435 1467 
1469 1482 1504 1524 1542 1547



1550 

1597

1618- 

1647 

1669- 

1738 

1765 



1601- 
1619 
1652 
1671 
1742- 
1772 



1578 1581 
1602 1611- 
1621 1625 
1654-1655 
1684 1706 
1744 1760- 
1774 



1583 1594 
1612 1615 
1637 1645 
1660 1666 
1722 1737- 
1761 1753- 



adult liver 



Clontech 



29 676 997 1063 1119 1536 176^ 



adult ovary 



ALV003 



Invitrogen 



AOV001 



1 4-18 20-23 29 35-40 42-48 50- 
51 53-58 61-63 65-66 68-69 73-75 
77-78 80 82 85 87 89 97 100-101 
103-104 106-108 110 113 115 118 
122-124 126 128 133-134 136-140 
142 145-147 149-157 161 166 168- 
170 174 177-178 180 182-186 188-
189 192-203 207 209 211-215 219 
221-224 229-230 234 242-243 246- 
247 255 258 260-262 265-269 271- 
272 274 277-281 284-286 288 290 
295 299 301-302 304 307 309-311 
313-314 316 321 323-326 330 332- 
333 335-338 341 344 349 352-353 
356 358 360 362 370-372 376-377 
379-384 387 390-392 394 397-398 
400 403 408-410 412 414-416 423- 
424 426-427 430-435 439 443-446 
448-449 451 453-455 462-463 468- 
471 473 476-479 481-484 487 489- 
494 496-497 499-501 503-505 509- 
514 516-517 519-520 522 524 526 
528-534 541-544 546-547 549 552 
554-555 561-564 566-567 569-570 
572-573 575-576 579 581 583 585-
588 590-591 593 595 597 599 601-
605 607-613 615 618-622 624-627
630 632-633 636-640 642 644-647 
649-652 654-655 657-665 667-675 
677-678 681 683-684 692-695 697- 
710 714-721 723 725-727 729 732 
734-735 743-746 750-751 753 758 
763 765 767 772-773 775-778 780 
783-784 786 788 790-791 794-796
800 803 805 809-811 813-815 818- 
819 821-824 826 828-829 831-832 
837-838 843-850 852-857 859-864 
867 869 871-872 874-875 878-883 
887-888 890-895 898-910 912-914 
916 919-922 924 926-927 929-939 
941 943-946 948-951 953 955-958 
961-964 966-967 970-979 981-982 
985-986 988-990 992 995-997 999- 
1001 1004-1009 1011-1013 1016 
1019-1020 1024-1025 1029-1031 
1033-1035 1037 1039 1041-1047 
1050-1051 1054-1060 1062-1064 
1067-1070 1072-1073 1075-1076 
1078-1079 1085-1086 1089-1090 
1094-1096 1098-1103 1106-1108
1112-1117 1119-1120 1123-1127 
1131-1135 1142-1143 1146-1149 
1153 1156 1158 1163 1165-1166 
1169-1171 1173-1175 1177-1178 
1180 1183-1185 1190-1191 1195
1197-1200 1202 1205-1214 1217- 
1219 1221-1226 1232-1235 1238- 
1241 1243-1244 1247 1249 1252- 
1254 1256-1258 1262 1265 1267- 
1268 1270 1275 1278 1280-1283
1286-1289 1291 1293-1294 1298-



1299 
1323 
1338 
1359 
1377 
1394 
1427 
1443 
1463 
1481 
1494 
1507 
1526 
1536 
1553 
1567 
1578 
1591
1609 
1636 
1657 
1671 
1690 
1713 
1726 
1738 
1751 
1765 
1778 



1306 
1327 

-1339
1361

-1375
1400
1429-
1445-

-1464
1484- 
1496- 
1511- 

-1527 

-1539 
1555- 
1569- 
1580- 
1595 
1611- 
1636 
1659- 
1673- 
1699 

-1714 

-1728
1740-
1753
1767- 

-1779 



1306 

1329- 

1341 

1365- 

1383- 

1404 

1431 

1450 

1466 

1485 

1498 

1517 

1530- 

1541 

1559 

1570 

1581 

1597- 

1621 

1641 

1662 

1674 

1702- 

1716- 

1731- 

1741 

1755- 

1768 

1783- 



1317. 
1330 
1343- 
1366 
1384 
1416- 
1435- 
1453- 
1468 
1488 
1501- 
1519 
1531 
1546 
1561- 
1572 
1587- 
1598 
1623- 
1643 
1664 
1676- 
1707 
1719 
1733 
1743- 
1756 
1770- 
1784 



1317 
1332 
1351 
1371 
1386 
1417 
1436 
1454 
1470 
1491 
1504 
1521 
1534 
1546
1563
1574
1588
1600 
1630 
1645 
1667 
1681 
1710 
1723 
1735 
1744 
1760 
1771 
1786 



1321 
1333 
1356 
-1375
1389 
1422- 
1439- 
1459 
1474- 
1493- 
1506- 
1524 
-1536
-1550
1566-
-1575
1590-
-1606
1634 
1647- 
1669- 
1683- 
-1711
1724 
1737- 
1748- 
1762 
1776 



adult placenta



Clontech



APL001 



5-8 44-45 90-91 107-108 159 178 
311 351 414 476 503 545 574 624 
636 719 755 773 860 890-891 924
947 955-956 962 990 992 1002 
1045 1202 1320 1369 1628 1686 
1713-1714 1743-1744 



60-61 79-80 103 
177 180 194 196 
236 272 290 299 
359 379-380 417 
448 454 483 490- 
723 725-726 728
843 854-855 857-
954 976 988-989
1033 1036 1064 
1139 1144-1145 
1317-1320 1343 
1438 1454 1482 
1519 1532 1549 
1626 1647 1649 
1722 1727 1730 



placenta 



Invitrogen 



APL002 



14-16 26 29 43 
106 116 135 171 
198 210 216 235 
309 329 334 339 
423 430 434-435 
491 517 522 631 
738 746 769 818 
858 916 948 953 
1005-1006 1013 
1068 1070 1086 
1160 1277 1285
1345 1429 1435 
1486 1490 1512 
1592-1593 1602 
1664 1673 1675 
1746 1776 



adult spleen 



GIBCO 



ASP001



3 5-8 12 15- 
44-45 57 60 
103 106 108 
147 152-153 
178-180 196
215 219 234 
272 280-281 
325 333 341 
387 394 406 
448 451 473 
505 517 519 
554 557 574- 
611-612 620- 
652 659 661 
700 721 728 
746 762 765 
810-811 817 
852-853 858 



16 19-21 24 
82-83 87 89 
117 119-121 
155 166 169 
198 201-206 
253-254 256 
290 295 302 
349 358 372 
414 431 434- 
481 490-493 
530 534 536-
576 582 592
621 623 631- 
667 671 673- 
730 732 738 
774 780 788- 
822 830 832
862 866 874 



29 34-36 
94 98-99 
139 141 
171 174 
209-211 
258 264
309 312 
382 386- 
436 446 
500 503 
540 547 
595 604 
632 642 
675 684 
742-744 
789 794 
845 848 
879 882 



884 906-908 912 919 
927 934 947- 949 957- 
978 983 990 992-994 
1005-1007 1010 1012 
1042-1044 1046 1049 
1070 1076 1089-1090 
1109 1113 1115 1124 
1170 1174 1177 1190 
1220 1226-1227 1229 
1246 1258 1269 1271 
1301 1320 1322 1330 
1339 1349 1351 1353 
1364 1369 1374 1386
1417 1434 1436-1437 
1474 1477 1480 1485 
1512 1522 1525 1544 
1560 1567 1591 1600 
1651 1654-1655 1658 
1674 1678-1679 1684 
1727 1733 1738 1740 
1761 1774 1779 1781 



921-923 926- 
958 963 977- 
996-997 999 
1031 1036 
1059 1068 
1094 1103 
1140 1163 
1196 1219- 
1236 1241 
1274 1295 
1334-1335 
1359-1360 
1397 1413 
1439 1468 
-1487 1498 
-1549 1553
1631 1636 
1662 1670 
1686 1700 
-1741 1760-
-1782



testis 



GIBCO 



ATS001 



5-8 10 26 30-31 47 50-51 57 68- 
69 82 84-85 97 102 113 119 137 
139 150 152 154 156 163 169 174 
176-177 192 194 196-197 212-215 
227-228 247 255 258 261 282 285 
288-289 301 307 311 316 330 334 
349 370-372 392 398 410 415 426- 
427 430-431 433 437 446 454 461 
469 473 477 481-482 493 499 502- 
503 513 522 526 547 552-553 563-
564 572-573 575-576 581-582 585 
599-602 605 612 615-617 620 631 
637 647 649-650 656 660 665 670 
674-675 712 719-721 723 728 731 
738 744 746 773 780 784 788-789
802 804 809 811 814 826 831 837
843 845 848 859 866 869 877 905 
913 916 919 921 926 929 937 950 
960 963 971 975 977 981 990 992- 
993 1007 1016 1029-1030 1034-
1035 1038-1039 1045 1059-1060 
1064 1070 1072-1073 1087 1089 
1097 1099-1102 1104 1108 1113 
1141 1149 1161-1162 1175 1208- 
1209 1222 1227 1229 1231 1235 
1238-1239 1243 1253 1285 1287- 
1289 1291-1293 1307 1311 1317- 
1320 1330 1332 1338 1345 1369 
1373-1374 1379 1389 1399-1400 
1409 1423-1424 1430 1435-1437 
1443 1459 1464 1486 1490 1493 
1496-1497 1501 1505 1509-1513 
1527 1530-1531 1533 1537 1546 
1549 1563 1565 1567 1569 1571 
1577 1586 1591 1599 1602 1625 
1628 1630-1632 1636 1639 1642 
1649 1661-1662 1666-1667 1670 
1675 1684 1690 1699 1705 1712
1717 1724 1730 1737-1738 1752 

1767 1779 

686 1352 1412



Genomic DNA
from BAC 63118



Research 
Genetics 
(CITB BAC 
Library) 



BAC001



Genomic DNA 
from BAC 39316 



Research 
Genetics 
(CITB BAC 
Library) 



BAC002 



1411-1412 



Genomic DNA 
from BAC 39316 



adult bladder 



Research 
Genetics 

(CITB BAC
Library) 
Invitrogen 



BAC003 



1352 



BLD001 



Clontech 



5-8 17-18 22-23 33
80 93 100 120-121 1 
251-252 272 278 311 
413 415 424 430 443 
543 562 564 607 616 
652 667 671 710 727 
773 786 788 837 840 
909 918 929 966 977 
1025 1055 1073 1082
1185 1189 1199 1270 
1536 1560 1573 1596
1637 1649-1650 1654
1669 1671 1690 1719 
1732 1739 1741 1760 



37-39 56-57 
69 201 237 
348 363 382
483 502 542- 
-617 626 635 
755-756 762 
866 893 898
983 1016 
1140 1167 
1369 1481 
1614 1636- 
-1655 1658 

1727 1731- 
-1761 1779 



bone marrow 



BMD001 



3-8 11 13 18 29-31 33 35-36 40 
43-45 47-48 50-51 57 60 65-66 75 
80 82 85 88-89 94 100 103 107 
110 115 118-119 124-125 133-134 
136-137 139-141 146 150 152-153 
155 161 163 168-170 172 178-180 
187 192-193 197-198 203-205 210- 
213 215 217 219 222 224-226 233 
235-237 242-244 255 258 260 263-
264 266 273 276 278 283 286 290 
295 301-302 307 312-313 321 330 
333 339 343 352 357-358 370-371 
382 384-385 387 389 394 408 410 
412 416 421 424-427 429-431 436- 
437 439 441-442 445 447 454-456 
461-462 471-472 475 477-479 481- 
482 485 488 493 498 500 503-506 
513 516 519 523-524 526 530 535- 
540 542 544-545 549 555 565 567 
569-577 581 583-586 588 593 601 
603-604 608-609 613-619 621-622 
632-633 636-637 642 649-650 656- 
660 666 670 672 674-675 679 683
701 708 716 718-720 731 735-736
740-742 744-745 752 761 765 772- 
773 775-778 780 785-786 789-791 
796 798 802 810-812 823-824 826 
830 832-833 837-838 843-844 848* 
855 858-859 866-867 869 878-880 
883 890-892 896 903 905 908 912- 
914 922-924 927 930-931 937 939- 
941 952-953 955-958 963 969 973 
976 981 985 987 990 992 995 1000
1002 1005-1007 1013 1016 1025 
1028-1031 1033 1035 1037 1039 
1042 1044 1047 1050 1053-1054 
1059 1061 1063 1066 1070-1071 
1079 1106 1110-1113 1115-1117 
1124 1126 1134-1135 1142 1144- 
1145 1163 1172 1178 1197 1199- 
1200 1202 1216-1217 1224 1227- 
1228 1240 1246 1254 1261 1266 
1270 1278 1281 1295 1287 1290- 
1291 1293 1299-1301 1308 1314 
1317-1320 1327 1331 1339 1343 
1346 1349 1353 1356 1361 1367 
1369 1372-1374 1379-1380 1394 
1400 1403 1406 1408 1413 1417 
1419 1423 1425-1427 1430-1431 
1433 1439 1443 1446-1449 1459 
1463-1464 1482 1486 1493-1494 



bone 



1506 
1526 
1546 
1557 
1592 
1626 
1638 
1653 
1684 
1713 
1727 
1772 



1509 
1528
1548-
-1559
1597-
-1628
-1639
-1655
1686
-1714
1737- 
1781- 



1513 
1531 
1549 
1571
1600 
1630 
1641 
1661 
1690 
1717 
1738 
1782



Clontech 



BMD002 



"152T: 

1536- 
1552 

-1572
1609

-1632
1646-

-1662
1702 
1720 
1740 
1785- 



1522 

1537 

1554- 

1581 

1614 

1634 

1647 

1676- 

1707 

1722- 

1758 

1786 



1524 

1543 

1555 

1589- 

1621 

1636 

1651 

1681 

1711 

1723 

1767 



11 15-16 19 30-31 35-36 68-69 75 
83-84 93 99 103 108-109 118 137 
139 169-170 174 177 180 190 193 
212-213 219 222 225-226 232 237 
255 259 264 273-274 284 286 290- 
292 295 301 303-304 307 312-313 
316 324 326 330 334-335 348 352- 
353 357 360 370-373 384 386-387 
397 403-404 414-416 421 425-427 
429-430 433-436 440 444 451 454 
465-466 472 475 478 491 493 516 
520 523 525 531 545 548 552 566 
569-570 581 583 590-591 597-598 
601 616-617 621 641 650 652 656 
659 671 674-675 679 684 710 718- 
719 728 734 737-738 742 761 765 
774-778 790 811 814 818 830 834-
836 854-855 859 866 869 871 878- 
879 884 889 892 904 922-923 932 
990 992 998 1001 1004 1016 1036 
1042 1048 1051 1054-1055 1058 
1088-1089 1106 1112-1114 1155 
1157 1192 1200 1223 1227-1228 
1236-1237 1260-1261 1282-1283 
1285 1287 1295 1314 1317-1321 
1324-1327 1330 1333 1341 1343 
1347 1350 1353 1355-1357 1367
1369-1370 1373 1377 1379 1381 
1383-1384 1394 1397 1400 1406 
1413 1417 1425-1427 1438 1442 
1446 1459-1460 1470 1493 1505 
1521 1536 1546-1549 1550 1573- 
1574 1578 1598-1600 1621 1626
1631 1634 1646 1649 1653 1656 
1658 1669-1670 1683-1684 1687-
1688 1690-1693 1696 1699 1702
1704 1707-1709 1711 1720 1722- 
1723 1725 1727 1729 1731-1733 
1738-1740 1743-1746 1752 1755 
1760-1761 1767 1777 1781-1782 
1786 



bone marrow 



Clontech 



BMD004 



73-74 503 922 1036 1711 
95-96 866 1320 1475 



bone marrow 
adult colon 



Clontech 
Invitrogen 



BMD007 



CLN001 



17 56-58 103 110 117 144 150 171 
179 185 188-189 201 204-206 210
218-221 225-226 231 237 251 277
288 310 312 320 333 359 386 388
394 408 420 455 481 485 503 510-
512 590-591 615 635 647-648 665 



672 



780 



684 697 710 725-726 743 
786 788 826-827 848-850 854-855 
858 866 872 898 918 921-923 953 
976 983 993 1005-1006 1017 1020 
1025 1027 1054-1055 1063 1068- 
1069 1140 1153 1170 1185 1196 
1199 1220 1280 1314-1315 1320 
1345 1351 1355 1369 1428 1439 



Mixture of 16 
tissues - 
mRNAs 



1462-1464 1512 1556 1*83 1587

1594 1596 1614 1625-1626 1631 

1639 1645 1650 1675-1677 1687- 

1688 1701 1713-1714 1724 1740 
1765 



Various 
Vendors 



CTL016 



401 1490 1686 



Mixture of 16 
tissues - 
mRNAs*
adult cervix 



Various 
Vendors 



CTL021 



312 782 1132-1133 1403 1712 1715



BioChain 



CVX001



1 4-8 11 13 18-21 25-26 30-31 33 
37-39 43 46-47 58 61 64-66 71 
73-74 82 85 94 100 103-104 113 
118 122 126 130 134 140 147 153- 
156 163 170 179 181 186 192 195- 
196 198 201-202 218-219 222 229- 
231 257 266 276-277 285-286 288 
290 301-302 304 307 312-314 324 
326 329-330 332 335 342 352 358 
362 371-372 376 379 381-382 384
388 398 400 410 414 416 419-420 
426-427 430-431 433-436 439 446 
448 461-462 464 471-477 479 482- 
483 491 493 496 503 506 510-513
516-517 526 530 535 542-544 546- 
547 557 561 572-573 575-577 581-
582 585-586 588-589 593-594 600
602 604-605 607-609 612 615-619 
623 644 650 654 657-658 662-665 
670 672 680 683 691-694 698 706 
708-709 711 713 720-721 727 729 
731-732 737 745-747 753-754 760 
765 771 774-777 780 790 793 796 
798 800 803 805 818 826 828 831-
832 834-836 843 847-848 851-855
857-860 864-866 869 871 876 878-
880 882 887 890-891 897 899-902
905-908 912-913 916 918-919 922 
927 932 934-938 944 948 955-956
958 963-964 967 969-970 972 976
978-979 983 985 990 992 1000 
1005-1007 1016-1017 1024 1027 
1033 1036 1038 1045 1047 1053-
1056 1066-1067 1071 1073 1075 
1079 1082 1098 1113 1124 1129 
1134 1139 1146-1149 1163 1167 
1170 1173 1175 1177 1181 1197 
1200 1202 1211 1214 1216 1221- 
1222 1225 1227 1232-1234 1240- 
1241 1243 1258 1264-1265 1268 
1270 1279 1287-1290 1308 1310- 
1311 1316 1320 1323 1327 1345 
1349 1353-1354 1360 1372-1374 
1383-1384 1386 1394 1397 1405- 



The 16 tissue mRNAs and their vendor sources are as follows: 1) normal adult brain
mRNA (Invitrogen), 2) normal adult kidney mRNA (Invitrogen), 3) normal adult liver
mRNA (Invitrogen), 4) normal fetal brain mRNA (Invitrogen), 5) normal fetal kidney
mRNA (Invitrogen), 6) normal fetal liver mRNA (Invitrogen), 7) normal fetal skin mRNA
(Invitrogen), 8) human adrenal gland mRNA (Clontech), 9) human bone marrow mRNA
(Clontech), 10) human leukemia lymphoblastic mRNA (Clontech), 11) human thymus
mRNA (Clontech), 12) human lymph node mRNA (Clontech), 13) human spinal cord
mRNA (Clontech), 14) human thyroid mRNA (Clontech), 15) human esophagus mRNA
(BioChain), 16) human conceptional umbilical cord mRNA (BioChain).
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Tissue Origin 


RNA Source 


Hyseq 






SEQ 


ID NOS: 








Library Name 




















1406 14l£ 1425-1427 1431 1436-1437
1442 1446 1448 1453 1459 1466 1472
1478 1482 1496 1501-1503 1506 1512
1522 1527-1528 1531 1533 1541 1547
1569 1571 1585 1589 1597-1598 1600
1608-1609 1614-1616 1620 1623-1624
1626-1628 1630 1638 1641 1643 1649
1653 1656 1662 1667 1669 1674-1675
1683 1685-1688 1699 1702 1709-1710
1715 1717 1722 1724 1729 1731-1732
1735-1739 1741 1743-1744 1748-1749
1755 1760-1762 1767 1773 1778
1785-1786










diaphragm 


BioChain 


DIA002 


137 282 289 730 780 986 1409 1478 1599 1614








endothelial cells

Stratagene

EDT001

3 5-10 13 15-21 24-26 29 34 37-39
42 44-45 50-51 53-55 57-58 60-61
65-66 68-69 73-74 77-78 80 82-83 85
87 89 93-96 101-105 108 110 112-114
116 118-122 124 128 133-134 137-142
147-150 152-153 161-163 166-172
176-179 187 190 192 194 196-201
204-207 210 212-214 220 224 229-230
233 235-236 240-241 251-252 258
261-262 265 267-269 272 276-277
279-281 284-285 288 290 295-296
301-302 310-311 313 316 321 325 329
331-333 335 340-342 351-355 360 371
375 380-382 384 387 390 392 397 400
407-408 410 412 414 416 425-427 431
434-436 439 444-445 449 454 463-464
472-475 477-479 486 488-490 497-498
500-504 510-513 516-519 522 524
526-528 532-534 536-540 542-546 548
561-563 566-567 572-576 579 581
585-586 589 593 595 597 599 603
607-612 615-617 620 622 626 630
632-634 638-641 644 647 656-660
662-664 670 673 678 680-682 692-697
707 709-710 712-713 719 730 732 734
736 738 743-746 751 759 768 771 773
775- 778- 783 786-789 793 800 803
805-807 810-811 814 816-818 821-822
824 826 828-829 832 834-838 842-845
848-850 854-860 862 864 869 871 874
876-879 883 885 887 890-891 894-895
898-900 903 908 910-913 916 919-922
924 926-928 930-935 939 943 948-949
951-954 957 959-961 964 969-970 973
975-978 983-984 988-990 992-993
996-997 1000 1002 1004-1013
1016-1020 1022-1025 1028 1031
1033-1034 1038-1045 1050 1055-1056
1059-1060 1062-1064 1067-1070
1072-1074 1076 1078 1082 1086-1087
1089-1090 1093-1097 1099-1103 1107
1109-1113 1116-1117 1124-1126
1128-1131 1134-1135 1138 1140
1144-1145 1148-1149 1153 1157 1160
1163 1171 1183-1184 1198-1199 1202
1205-1207 1211 1216-1217 1219 1221
1225 1229 1232-1235 1238-1241
1243-1244 1246 1250 1253 1257-1258
1261



1265-1266 1268 1270-1271 1274- 
1277 1280-1283 1285-1286 1288-
1290 1293 1295 1298 1308 1312 
1317-1320 1324-1325 1327 1329- 
1330 1334-1335 1338 1342-1343 
1345-1347 1350 1355-1356 1359 
1367 1369 1374 1376 1379 1398 
1400 1406 1408 1414 1417 1419 
1424-1426 1428-1431 1434-1438
1440-1442 1448 1450 1462-1466 
1468 1472 1474 1478 1487-1480 
1491-1493 1501-1504 1506 1509 
1511 1516 1520-1521 1526 1529 
1531 1536-1537 1539-1540 1546- 
1547 1549 1552 1555 1557-1559 
1561-1565 1568 1571 1575 1578- 
1579 1581-1583 1587-1588 1590 
1592 1597 1605-1606 1611 1613
1615 1618-1621 1624-1628 1630- 
1631 1634 1636 1638 1641 1643- 
1650 1652-1659 1664 1666-1667 
1669 1671 1675-1681 1683-1688
1696-1698 1703 1711 1715-1716 
1719 1722-1723 1726 1731-1733 
1736 1739-1741 1743-1744 1749 
1755 1760-1761 1765 1767-1768 
1771-1773 1776 1779 1783-1786 
286 686 1297 1303-1304 1352 
1411-1412 1754 



"Genomic clones 
from the short 
arm of 
chromosome 8 



Genomic DNA 
from 
Genetic 
Research 



EPM001 



esophagus 



BioChain



ESO002 



131-132 261 289 380 503 860 092 
1000 1007 1397 



62-63 89 112 126 194 322 336-338 
379 391 411 481 546 563 607 679 
710 867 1012 1031 1055 1251 1262 
1320 1407 1643 1652 1686 1731- 
1732 1746 1765 



fetal brain



Clontech 



FBR001 



fetal brain 



Clontech 



FBR004 



68-69 90-91 139 212-213 301 331 
362 374 403 436 611 645-646 659 
668 670 691 785 805 845 1163 
1209 1216 1232-1233 1238-1239 
1387 1410 1416 1430 1496 1536 
1547 1593



fetal brain 



Clontech 



FBR006



5-9 25 43 60 62-63 65-66 70 72
80 87 92 101 103 108 114 136 139 
149 152-153 157 168 171-172 175 
207-208 210 212-213 221-226 237- 
238 251-253 266 272 279-281 295 
301-302 307 310 317-318 321-324 
330 333-334 336-338 346-347 352 
357 370 373 377 379-380 382 384 
391-392 397 399 402 406-408 410- 
411 417 421 424 426-427 430 436- 
437 440-443 454 460 464 467 473 
476 483 488-489 495 497 508 510- 
513 516 519-520 524 530 537-540 
544 547 550 561 567 572-574 582 
590-591 595 597 604 607-609 615 
623 628-629 631 634 636-640 655 
657-658 660 665 669 674-675 679 
689 691-694 696-697 699 701 706 
710 716 720 728 732 734 736 742- 
744 757-760 763 775-778 780 799 
806-807 810 817-818 826 839 843 
858 861 864 871-872 864 890-891 
894-895 890 904 915 921-923 935- 
936 938 945 950 952 955-956 958- 
959 961 963 967 969-971 990 992



999 1001 1005-1006 1008 1013 
1016 1022 1024 1029-1030 1032 
1035 1042 1047-1048 1052 1056
1065 1067 1070 10S2 1089 1109 



1114-1115 1119 1131 
1151 1153-1156 1160 



1143-1149 
1163 1167 



1172-1173 1178 1184 1186 1188 
1190-1200 1211 1216 1222-1223 
1226-1227 1229 1231 1236 1245
1253-1255 1258 1260 1262 1266 
1270-1273 1281 1287 1308-1309 
1314 1317-1320 1326 1334-1335 
1339 1341 1344 1350 1356 1369- 
1371 1373 1376 1379 1381-1382 
1386 1392 1396-1398 1419 1423 
1425-1426 1428-1429 1432 1437 
1440-1441 1448 1466 1470 1482 
1502-1503 1507 1511 1513 1516 
1519 1536 1544 1549-1550 1557- 
1559 1573 1589-1590 1598 1608 
1611-1614 1619 1621 1625-1626 
1640 1651 1657-1658 1676-1679 
1693 1696 1703-1704 1713-1714 
1718 1720 1722 1724 1726 1728 
1730-1733 1735-1736 1738-1739 
1742 1745 1755 1759-1761 1765 
1767 1771-1772 1777 1779-1780 
1786 



235-236 520 864 1068 1188 1587 



fetal brain 



Clontech 



FBRS03 



fetal brain 



Invitrogen 



FBT002 



15-18 20-21 24-25 29 34 43 61-63 
77-78 98 101 103 107-108 128 130 
136 146 148 165-166 171 174 181 
185 196-198 204-205 208 223 230 
235-236 251 253 261 268-269 280-
281 284-285 288 309-311 321 329 
334 339 346-347 350 357-359 381- 
383 390 407 418-419 430 434-435 
438 443-444 461 464-466 483 490 
494 509 516 519 522 527 557 561- 
562 572-573 590-591 595 597 623 
632 647-648 650 655 669-670 672 
682 690-691 700-701 710 717 736 
746 782 784 788-789 814-815 825 
829 840-841 847 854-855 857-858 
897-900 904 919 925 935-937 946 
948-949 954 960-962 966 969-970 
986 996 1000-1001 1005-1007 1012
1014 1022-1028 1045 1052 1055 
1068 1070 1072 1078 1082 1085 
1090 1109 1115 1118 1120 1128
1136-1137 1144-1145 1149 1156-
1157 1193-1195 1198 1204-1205
1220 1222 1234 1257 1262 1271
1274-1275 1280 1285-1286 1294 
1312 1314 1317-1320 1330 1342 
1344-1345 1349-1350 1355-1356 
1358 1364 1369 1379 1383-1384
1431 1435 1476 1507 1519 1532 
1536 1547 1554 1564 1567 1578 
1582 1587 1593 1595 1601 1608 
1615 1619-1621 1638 1644 1661 
1665-1666 1673 1687-1688 1690 
1715 1723 1728 1749 1753 1757
1759-1761 1765 1771 1774 1776 
1778 1781-1782 1786 



fetal heart 



Invitrogen 



FHR001



105 124 150 299 864 1036 1148
1229 1614 1616 1762 1785 



fetal kidney 



Clontech 



FKD001



5-8 11 40 47 57 65-66 82 85 102 
124 163 171 216 222 224 235-236



fetal kidney 



Clontech 



258 277 280-281 307 310 314 330 
371 387 392 395 403 422-423 431 
436 443 455 469 500 519 522 542 
563 572-573 585 600 619 623 650 
654 657-658 660 679 719 731 780 
798 821 833 844 854-855 857 864
868 878 911 929 958 960 969 990 
992 1007 1046 1087 1103 1129 
1139 1285 1312 1331 1355 1369 
1371 1376 1391 1422 1425-1426 
1440-1441 1470 1543 1598 1601 
1618 1631 1651 1654-1655 1669 
1678-1679 1691-1692 1733 1705 



FKD002 



fetal kidney 



352 384 42^-427 440 583 602 10^0 
1131 1324-1325 1636 



fetal lung 



Invitrogen
Clontech 



FKD007 



20-21 82 163 335 679 988-989 
1000 1227 1230 1320 1554 



FLG001 



fetal lung 



Invitrogen 



FLG003 



35-36 94 323 371 398 426-427 445 
473 549 560 604 616-617 626 631 
649 651 719 746 786-787 832 842 
849-850 864 894-895 1075 1178 
1182 1200 1206 1309 1311 1345 
1429 1493 1567 1576 1620 1686 



fetal lung 



Clontech 



9 15-16 29 41 47 68-69 83 88-89
102 124 137 152-153 165 196 224 
229 231 249 254 256 267 291-292 
300 325 333 344-345 352 373 376 
379 384 408 426-427 430 432 467- 
458 475 483 488 493 516 531 535 
545 547 549 564 582 602 623 644 
660 662-664 670 673 725-726 728 
761 766-767 774 805 830 852-853 
864 875 921 932 937 946 949 963 
988-989 1014 1016-1017 1024 1027
1090 1097 1170 1185 1200 1215- 
1216 1224 1258 1290 1309 1320 
1342 1347 1355 1369 1381 1413-
1414 1431 1438 1449 1491 1512 
1536 1547 1557-1560 1567 1590 
1601 1636 1644 1653-1655 1662 
1667 1671 1675 1680-1681 1706 
1739 1760-1761 1769 



FLG004 



103 276 334 465-466 737 843 1131
1614 1658



fetal liver- 
spleen 



Columbia 
University 



FLS001 



3-11 13 15 
51 54 56-58 
77-80 82-83 
110 112 116 
135-139 141 
157 163-165 
180 166 188 
200 202-206 
233-236 240< 
255-256 258 
274 276-278 
293 295 299 
311 314 316 
332 342 344- 
358 360 362 
386-387 390 
406 408 410- 
437 439-442 
456 459 461- 
487-488 490- 
506 S09-513 
529 531 534 
553-554 561- 
576 579 581 



21 25 30-39 41-4 
60-66 68-69 72 
85 87 89 92-103 

-124 126-127 130 
144 147-149 152 
167-172 174 176 

•190 193-194 196 
210-214 219 221 

•244 246-247 250 
261-265 266-269 
200-281 284-266 

•301 304 306-307 
318 320-321 326 

■345 350 352-353 
370-374 376 378 
392-393 400-401 
412 415 417 419 
444-445 448 452- 
470 472-479 481- 
491 493 500-501 
515-520 522-524 
536-540 542 547- 
562 564 567-568 
583 585-597 599- 



8 50- 

75 
105- 
133 
153 

-17B 
198- 

-231 

•251 
272 
288 
309 
329- 
3S6- 
384 
403 
422- 
454 
483 
503- 
526- 
549 
571- 
605



607 610-613 615-621 623-624 626 
628-634 636-640 644 647-650 655- 
660 665 669-670 672 674-675 678 
681-682 684 690-695 697 702 708- 
710 713-714 716-719 725-728 730- 
731 734 736 738 740-741 743-746 
748 750-751 759-766 768 772 774-
777 779 783-788 793 796 798 800- 
805 808 810-812 814 810-819 821- 
824 826-832 834-837 843-847 849-
867 869-876 878-883 887 889-895 
897-898 902 904-914 916 919 921- 
928 930-937 939 945-950 953-958 
960-961 963-965 967 969 971 974-
978 980-983 986 988-990 992-993
995-997 1000-1002 1004-1008 1012
1014 1016-1019 1025-1026 1028- 
1031 1033 1035-1036 1039-1044 
1047 1049-1050 1053-1056 1058- 
1059 1061-1064 1067-1070 1072- 
1074 1076 1078 1082 1085-1087 
1089-1090 1097 1099-1103 1107- 
1113 1115-1119 1121-1123 1125 
1127-1128 1131-1134 1136-1137 
1144-1150 1153 1159-1160 1163 
1170 1175 1177-1178 1188 1190- 
1192 1195-1200 1202 1206 1208- 
1211 1214 1216 1218 1221-1222 
1225 1227 1234 1237 1241 1244 
1246-1247 1251 1254 1258 1261 
1266 1268 1270-1273 1277-1282 
1284-1285 1287-1290 1294 1299- 
1300 1306-1308 1313-1320 1324- 
1325 1327 1330 1332-1333 1338 
1341 1343 1345-1347 1349-1350 
1353-1360 1362-1363 1365-1367 
1369-1370 1372-1374 1376 1378- 
1381 1383-1384 1386 1389-1391 
1400 1402-1403 1405-1410 1413 
1415 1417-1419 1422-1429 1431 
1435-1437 1439-1442 1445-1446
1448-1449 1454 1458-1459 1466- 
1470 1472 1474 1477-1478 1480 
1482 1485 1491-1493 1496-1498 
1501-1507 1509 1511-1512 1516- 
1519 1524-1526 1529 1532 1536- 
1541 1546-1547 1549-1550 1552- 
1554 1562 1564 1569 1572 1574- 
1575 1578 1581 1583 1587-1588
1591-1592 1594-1595 1597-1598 
1600-1604 1611-1612 1614-1615 
1617-1618 1620-1622 1624-1625 
1627-1628 1630-1632 1634-1639 
1645-1651 1653-1662 1664 1667- 
1669 1671 1673-1674 1676-1688 
1690 1696 1701-1703 1706-1709 
1711 1713-1714 1718-1719 1722 
1724-1727 1731-1733 1738 1740- 
1741 1743-1744 1746 1748 1751- 
1752 1754 1760-1765 1767-1773 
1780 1783-1786



fetal liver- 
spleen 



Columbia 
University 



FLS002



3-11 13 15-21 26 29 32 35-39 42 
44-45 48 50-51 54-55 57-58 61 64
68-69 73-75 78 80 82 84 87 95-98 
100 103 105 107-108 110 112-113 
116-119 122-125 128 130 137-138
145 147-153 155 157 159 161-163 
166 168 171-172 174-175 177 181 
188-189 193-194 196-198 200-203



206 212-215 219-221 223 225-229 
231-232 240-244 246-247 250-251 
258-259 262 264 268-269 272 275 
277 280-281 284 286 288 290-292 
295 298-299 301-304 306 308-310
318 320-321 323 325 329 331 334 
342 348-349 352-353 356 359 368 
371 374 376-379 381-384 386-387 
392-393 397-398 400-401 403 410- 
413 421 423 426-427 429-430 433- 
436 438 440 443 445 448 451-452 
454-455 460-463 465-467 469 471- 
473 475-475 478-479 481-483 487 
490-491 493-494 497 500-501 SOS- 
SOS 509-513 515-517 519-520 524 
526-531 534 537-542 544 547 552- 
561-562 564-567 571- 



554 556 558 
577 583-587 590-591 
601 604-606 608-613 
624 626-632 
649-652 654 



593 595 597 
616-617 619- 
634 637-642 644 647 
659 662-665 669-672 



674-675 681-682 685 688 690 696 
698 700-703 707 709-710 713 717 
719-721 723-724 728 731-732 734 
737-738 742-745 748 752 754 759 
763-766 768 770 773-777 780 782 
784 786 791 795-798 801-802 805 
808 811-812 818 823-824 826-827 
832 834-837 839 843 846 848-856 
858-861 865 867 869 871 873-874 
876 878 881-882 887 889 892 894- 
898 901-902 904 906-908 913-915 
919 921-924 926-932 934-935 937 
939-941 943 946-947 950 953 958 
961 965-967 971 973-975 977-979 
981 984-985 990 992-993 995-997 
999 1001 1004-1007 1009-1011 
1013 1016 1020 1023 1025 1027- 
1031 1033-1035 1039-1042 1044- 
1045 1049 1053 1055-1056 1058- 
1059 1062 1064-1065 1067-1070 
1072-1074 1079 1082 1087 1089
1093 1097 1099-1103 1105-1107
1109-1114 1123 1125-1127 1132-
1134 1140 1143-1145 1148-1150
1156 1158 1160 1163 1172-1173
1177-1178 1181-1184 1190-1192 
1195-1197 1199 1204 1206 1208 
1211 1214 1216 1219 1227 1230 
1234-1235 1237 1240-1241 1243 
1245 1247 1256 1258 1260-1261 
1264 1268 1270-1271 1275 1278- 
1279 1284-1286 1288-1289 1299- 
1301 1306 1308 1312 1314 1317- 
1319 1323-1325 1327-1330 1334- 
1335 1339 1343-1347 1349-1350 
1354-1355 1357 1360 1362-1363 
1365-1367 1369 1372 1376 1378- 
1380 1386 1389-1391 1394 1400
1403 1406 1409 1416-1419 1422- 
1427 1429 1435 1437-1438 1440- 
1442 1446 1448-1450 1453 1460- 
1461 1468 1470 1472 1474-1475 
1478 1482 1486 1490-1493 1496 
1498 1500-1504 1506 1508-1509 
1511-1512 1516 1518-1519 1521 
1524-1528 1531 1536-1538 1543 
1547 1550 1554 1556 1564 1567- 
1569 1580 1587-1588 1591-1592 






WO 01/53312 



PCT/US00/34263 






•1528 
1646- 

•1662 
167S 

-1692 
1714 
1730- 
1748- 

•1764 
1779 



1600^ 

1630- 

1649 

1664 

1683- 

1699 

1717 

1733 

1752 

1767 

1783- 



l^oT 

1631 
1652 
1667 
1684 
1702 
1719 
1738 
1758 
1769 
1786 



1611- 
1635- 
1654- 
•1669 
1666- 
1707 
1722 
1740 
1760- 
1772- 



1612 

1638 

1659 

1674 

1688 

1711 

1726- 

1743- 

1761 

1773 



1597 - 
1618 
1641 
1661 
1676 
1691 
1713 
1727 
1744 
1763 
1776 



fetal liver- 
spleen 



Columbia 
University 



FLS003 



103 300 318 321 352 372 379 381 
384 392-393 403 422 424 429 434-
435 440 444 453 503 515 544 592 
978 1064 1324-1325 1327 1333 
1357 1369 1378 1418 1424 1622 
1646 1649 1680-1681 1689-1690 
1717 1743-1744 1769 



15-16 26 34 58 61 64 70 75 78 89 
98 105 112 116 120-121 123 133 
151 166 176 180 194-196 198 200 
204-206 210-211 220 225-226 230
235-236 239 247 259 261 267 272 
277 280-281 303 310 313 317 320- 
321 329 344 356 371 374 376 379- 
382 395 408 412 414 419 429 434-
435 441-442 465-466 490 494 504- 
506 509 522 527 534 552-553 562 
567 569-570 572-574 607 631 657- 
658 667 669 672 685-686 702 717 
725-726 732 748 759 761 778 784 
786 809 817 829 837 857 861 872-
873 875 881 889 894-895 909 911
916 954 963 967 974 977 986 988-
989 993 995 997 1000 1005-1006
1008 1014-1015 1020 1042-1043
1070 1086-1087 1089-1090 1118-
1119 1122 1144-1145 1148 1153 
1157 1159 1183 1195-1196 1227 
1250 1257-1258 1262 1267 1280 
1285 1307 1312 1314 1317-1320 
1344-1345 1349-1350 1355 1362-
1363 1403 1405 1415 1419 1425- 
1426 1429 1431 1442 1448 1463- 
1464 1469-1470 1489 1528 1536 
1539 1549-1550 1557-1562 1577 
1583 1598 1601 1611 1615 1622 
1644 1649 1666 1674 1706 1721
1738 1746 1763-1765 1774 1776 
1779 



fetal liver 



Invitrogen 



FLV001 



fetal liver



Clontech



FLV002 
FLV004 



676 998 1719 

93 133 214 301 355 374 379 555 
581 601 679 837 847 859 1123 
1236 1270 1313 1324-1325 1327 
1355 1367 1425-1426 1536 1690 
1733 1760-1761 



fetal liver 



Clontech



84 St! 89 98 



fetal muscle 



Invitrogen 



FMS001 



26 37-39 50-51 58 
113 128 131-132 139 
194 198 201 206 211 
261 276 282 286 302 
375 379 383 398 412 
436 448 452 462-463 
519 529 561 569-570 
607 623 626 635 647 
725-726 730 733 761 
826 837 860 874 913 
970 980 986 988-990 
1001 1007 1014 1027 
1045 1060 1064 1070 



361 
430 



155 172 186 
230-231 256 
325 359 
413 419 
473 477 503 
590-591 597 
660 672 715 
775-777 788 
915 921 935 
992 1000- 
1035-1036 
1083 1097



1099- 

1173 

1266 

1324- 

1383- 

1433 

1557- 

1632 

1712 

1766 



1102 
1198 
1270 
1325 
1364 
1505 
1559 
1644 
1725- 



1116- 
1208 
1277 
1329 
1399- 
1514 
1562 
1650 
1726 



1117 
122B 
1298 
1336- 
1400 
1542 
1589 
1652 
1743- 



1121 
1240 
1317 
1337 
1403 
1551 
1599 
1671 
1744 



1164 
1258 
1320 
1369 
1409 
1554 
1620 
1675 
1754 



fetal muscle



Invitrogen 



FMS002 



119 221 273 402 426-427 463 547 
599 736 869 1000 1033 1083 1266 
1431 1440-1441 1468 1545 1599 
1673 1678-1679 1687-1688 1710 
1712-1714 1723 1725 1731-1733 
1743-1744 1760-1761 1767 



fetal skin



Invitrogen



FSK001 



1 4-11 15-16 20-23 25 29 33 40 
43 46 56-57 60-61 64-66 75 82 87 
97-98 105 107-108 113 118-119 
123 133 135-137 139 144 146 148 
151-153 156 163 170 176 180 188- 
189 197-198 200 202-203 210 218 
222 231 246-247 261 263 265-270 
277 285-286 290 293 299 301 307 
311 321 325 328 330 333-335 339
341 345 351-352 355-356 358-359 
362 368 370 372 376 379-382 384 
388 394 404-405 408-409 411-412 
419-420 424 426-427 436 441-442 
445 448-449 454 462 465-466 472 
476 490 493 504 506 509 515-517 
519 526 531 537-540 547 549 560-
561 567 572-573 581 584 589 611- 
612 615 623 630-631 635 647 649 
651 657-658 660 662-665 667 669 
672 676 678 681 688 701 704-705 
709-710 713 717 720-721 725-726 
728-729 732 748 750 753 759 764 
766 770 775-777 780-781 786 788- 
789 798 809 811 814 816-817 822 
824-826 831 842 857 859 861 863-
864 881 894-895 908 910-911 916 
918 922-923 928 932-933 935 937 
946 948-949 953 960-961 966-967 
970 975 977 986 990 992-993 999-
1000 1004 1007 1013 1018 1025 
1027 1032 1035 1041-1043 1054 
1057-1058 1060 1062-1064 1069 
1072 1077 1090-1091 1097 1099- 
1103 1108 1113 1119 1123 1128 
1131 1134 1140 1148-1149 1152- 
1153 1156 1163 1167 1178 1182 
1189 1192 1195-1196 1198 1201- 
1205 1208 1211-1212 1216 1219- 
1220 1222 1225 1240 1243 1258 
1266-1267 1274 1277 1280 1282- 
1285 1299 1310 1317-1322 1324- 
1325 1329-1330 1342 1344 1346 
1349-1351 1354-1357 1365-1366 
1369 1371 1373 1376 1378 1380 
1383-1384 1387 1399-1400 1405 
1410 1427 1429 1431 1433-1435 
1439-1441 1448-1449 1454 1457 
1468 1470 1472 1475 1480-1481 
1487 1490-1491 1493 1498 1509 
1512 1521 1525-1526 1529 1535- 
1536 1547 1549 1557-1559 1588
1592 1595 1597-1598 1601 1603- 
1604 1608 1611 1614 1618 1624-



-1636 
•1657 
16B5 
•1710 
1732 
1755 
1777 



1641 
1660- 
1687- 
1716 
1737- 
1760- 
1779- 



1643T 

1662 

1689 

1719 

1740 

1761 

1780 



1626 
1644 
1665 
1702- 
1724 
174 2 
1765 
1786 



1632 
1646 
1668 
1703 
1727 
1747 
1772 



1634 
1654 
1675 
1709 
1731 
1749 
1776 



fetal skin 



FSK002 



Invitrogen 



fetal spleen 
umbilical cord



13 286 302 307 
339 341 354 370 
408 414 426-427 
515 544 585 598 
1076 1109 1155 
1333-1335 1343 
137i 1377-1378 
1466 1647 1656 
1688 1693 1718 
1732 1739 1755 



313 321 330 335 
372 365 400 402 
433 436 450 454
767 810 945 939 
1317-1320 1326 
1347 1350 1369- 
1391 1397 1422 
1678-1679 1687- 
1721 1725 1731- 



Biochain 
BioChain 



FSP001 
FUC001 



110 137 211 353 *89 9*7 1108 
1639 1771 



4-8 10 12 14 17 33^36 44-46 57 
64 68-69 75 82 85 101 104 113- 
114 116 119 122-124 133 137 153- 
154 157 161 163 166-167 175 181- 
184 186 192 197-198 200-202 212-
215 230 234 246-247 251 256 263 
267 271-272 280-281 284 295 301 
314 317 321 326 333-335 345 351 
356 368 371-373 379-380 386 390 
392 394 406 408-410 412 414 416 
420 424 427 430-436 438 444-446 
454 459 461 463 467 473 482-483 
486 488 490 495 504 509 524 526 
537-540 547 555 561 574-577 588- 
591 593 606 615 620-621 632 637 
645-647 650 659-660 662-664 667-
668 674-675 684 687 696 698 701
703-705 709 711 714 719-720 725-
727 732 749-750 762 765 771 775-
777 780 789-791 793 796 802-803
814-817 822 833 843 845 848 858
861 864 875 879 888 894-895 897-
900 903 906-907 911-912 925 930-
933 936 940 948 953 960 966 977
984 990 992 998 1000-1001 1005-
1007 1016 1023 1025 1037 1046- 
1047 1059 1061-1063 1073 1076- 
1077 1089 1094-1097 1112-1113
1115 1134 1144-1148 1131 1154 
1156 1163 1171 1197 1204-1205 
1208 1216 1218 1224 1234-1235 
1243-1244 1246 1279 1283 1286- 
1287 1298 1316 1320 1344 1346 
1350 1357 1359 1371 1373 1375 
1381 1398 1400 1403 1408 1414 
1424 1427-1428 1431 1433 1440- 
1442 1446 1454-1455 1479 1482 
1484-1485 1489 1492-1493 1504- 
1505 1513 1525 1527 1536 1538 
1546 1565 1567 1571 1573 1575- 
1576 1578-1579 1591 1595 1600- 
1601 1608 1612 1615 1621 1624 
1626 1636-1637 1647-1648 1651 
1653 1656 1658 1661-1662 1672 
1675 1682 1684 1686-1688 1690 
1709-1710 1722 1727 1729 1735- 
1738 1740-1741 1760-1761 1768 



fetal brain 



GIBCO 



HFB001 



4 9 11-13 17-18 22-23 25 37-39 
42-47 50-51 54-55 58 60-61 65-66



72 75 77 80 82 85 90-91 94 100- 
102 107 110 112-116 118-119 122- 
123 126 128 134 136-140 147-148 
153-155 157 161 165 169-172 175 
181 186 188-189 197-198 204-206 
208 210 215 222-223 225-226 230 
235-238 240-241 247 253 256-258 
260-262 267-269 276 279-281 284
286 289 298 300-302 307 310 318 
321-323 325 330-331 339 341 346- 
349 352 354 356-359 362 364-365
371-372 377 379-380 382 384 387 
390 400 408 414-416 419 424 431 
434-435 438 441-443 449 451 453- 
455 457-463 470 472-473 475 477- 
478 482-483 486-488 490-491 493 
496 499-500 502-504 506-507 509- 
512 516 519-520 522 525-526 529- 
530 537-540 543-544 546-547 566- 
567 569-570 572-582 585 588 590-
591 593 595 599 601 604 606-609
611-612 614-620 622-624 630 632 
636 643 645-647 650-652 654 659 
661 665 667-668 670-672 676 678 
681 687 689 692-694 697 699 710 
714 717 721 727 729-732 734 736 
738 743-746 750-751 759 763 766 
770 772 775-777 784 789 791 796 
799 802-805 810-811 814 819-821 
824 826 830 834-837 839-850 854- 
856 858-860 862 864 869 871 876-
877 879 883 886-887 890-891 893-
895 898-901 905 908-910 912-916
919 922-923 925 927 930-933 935- 
938 948 952-960 963-964 967 969- 
972 975 978-979 981 983 986-987
990 992 995 997 999-1002 1005- 
1009 1011-1013 1016 1018-1019 
1023 1026 1029-1031 1033-1035 
1038 1041 1047 1050 1053 1057 
1059 1064 1068 1070 1072-1073 
1078-1079 1081-1082 1086 1089
1094 1097 1103 1107-1109 1113- 
1115 1121-1122 1127 1134-1135 
1138 1140 1143 1148-1151 1153
1156-1157 1159 1167 1170 1175 
1193-1194 1200 1202 1207-1209 
1211 1216 1219-1220 1226-1227 
1229 1232-1234 1240-1241 1243 
1246 1249-1251 1253-1254 1258
1267-1268 1271 1276 1279 1282 
1285-1289 1293-1294 1305 1307- 
1308 1312 1316 1320 1327 1338- 
1339 1341-1344 1346 1349 1355- 
1357 1359 1365-1366 1369-1370 
1373-1375 1379 1386 1389 1394 
1398 1409 1413-1414 1416-1417 
1420-1421 1425-1427 1430 1433 
1437 1439 1442 1445-1452 1454- 
1457 1459 1463-1464 1468 1470 
1474 1477-1479 1489 1492 1494 
1497-1498 1501-1503 1507 1509 
1511-1513 1517 1520-1521 1524- 
1526 1531-1533 1535 1537-1538 
1547 1554 1556-1559 1564-1567 
1571 1584 1587 1589 1594 1599- 
1601 1611-1612 1614-1616 1619- 
1620 1625-1628 1630-1631 1634 
1637-1638 1640-1643 1645 1648-



macrophage 



Invitrogen 



1649 1651 1653
1664-1665 1667
1679 1683-1684
1704-1705 1709
1720 1724 1727-
1737-1738 1743
1755 1757 1760
1779 1785



1655 
1669 
1686 
1713- 
1728 
1744 
1761 



1657- 

1673 

1693 

1714 

1731- 

1752 

1765 



1658 

1678- 

1701 

1717- 

1733 

17S4- 

1772 



HMP001 



5-8 110 204-205 
878 933 988-989 



503 634 678 859 
1379 1448 1504 



Columbia 
University 



IB2002 | 10 12-13 l£-18 22-23 25 29 34 

37-39 43 47 50-51 54-56 58 60-63 
65-66 68-69 72-74 80 82-83 86
88-92 97 100 102-104 106-108 110 
112-113 115-116 118 123 128 130 
134-136 138-139 143 147-149 151- 
152 154-155 163 165-167 169 172- 
175 181-184 186 193-196 198 201 
203-205 209-210 214-215 222 224 
226 231-232 235-236 239 246-247 
252 257 260 268-269 272 276-277 
279-281 286 288 291-292 295 298 
300-301 304 307 310 313 321-323 
330-331 333-334 339 346-347 349 
352 356-357 362 371-372 377 379- 
380 383-384 392 397 401 406 408 
411 413-414 416 418-419 422 428 
430-431 434-435 438 443 449 453- 
454 461 464-466 469-470 472-473 
475-476 478 482-483 487 490 492 
494 497 503 507-508 510-513 516
519-520 524-526 530-534 536-540
547 550-551 561 563-564 566-567
572-576 579 581-582 584-587 590-
591 593 595-597 607-609 611-613 
616-617 620 622-624 627 631 637 
641 645-647 650-655 657-658 660- 
665 667-675 689 691 695 697 699
703 707 713-715 717 721 728-731 
733-736 739 743 745 751 755 759
763 769-770 772 778 780-781 785 
788-789 793-794 799 803 808 811 
814 825-826 830 834-836 840-843 
845 848-850 854-855 860 862 864- 
865 870 872 875-876 878 886 888 
890-891 894-896 898 903-904 916- 
917 919 922-925 927-928 930-932 
934-936 939 941 945-946 948-950 
953-954 959-962 966-969 977 979 
981 986-990 992 997 999-1000 
1004-1006 1014 1016 1018-1019 
1024-1025 1033 1036 1047 1051- 
1052 1054-1055 1057-1059 1063- 
1064 1068-1070 1073 1081-1082 
1085 1089 1108-1113 1118-1120 
1123-1124 1130 1132-1138 1140 
1149 1151 1153-1154 1163-1170 
1172 1174-1175 1183-1184 1188 
1190 1193-1194 1196-1197 1199 
1204 1208-1209 1211 1218-1222 
1226-1227 1229 1231 1234 1241 
1247 1249 1251 1256 1258 1261- 
1262 1269 1274 1279 1281 1283 
1285 1287-1289 1294-1295 1305 
1307 1313-1314 1316-1320 1329 
1332 1341-1342 1345 1349 1356 
1362-1363 1365-1366 1368-1370 
1374 1381 1383-1384 1388 1400 
1403 1406-1407 1413 1417 1420



1423 
1441 
1454 
1468 
1483 
1499 
1522 
1542 
1555 
1580 
1593 
1610 
1624 
1639- 
1654- 
1672- 
1693- 
| 1717- 
1733 
1755- 
1777- 



1429 
1443 
-1455 
1470 
1485 
1502 
-1523 
1546 
1563 
1583 
1595 
1612 
1626- 
•1640 
•16S5 
•1673 
•1595 
1720 
1735- 
1758 
1770 



-1431 
1447 
1457 
-1471 
1493 
-1503 
1525 
-1547 
1565- 
-1586 
1598 
1614- 
-1627 
1642 
1658- 
1676- 
1701- 
1723- 
•1741 
1762 
1786 



1435 
-1449 
1459 
1475 
-1494 
1505« 
1528 
1549- 
-1567 
1588 
1600- 
-1616 
1630- 
1644 
1659 
•1681 
1702 
1724 
1743- 
1765 



•1436 
1451 
1463 
1479 
1496 
-1507 
1531 
-1550 
1569 
1590 
-1601 
1619 
-1633 
1647 
1664- 
1685- 
1704 
1726- 
1744 
1771 



1439- 
-1452 
-1465 
1482- 
1490- 
1509 
: 1533 
1554- 
1S75 
1592- 
1608- 
1621 
1637 
1652 
•1665 
1688 
1708 
1728 
1752 
1774 



infant brain

Columbia
University



IB2003

17-18 20-23 29 34 43 60 68-69

78-80 88 100-101 107 110 112 118 
123 128 133 135-137 146 148 152 
159 166 169 174 194 198 203 215 
223 225-226 229 23S-236 247 260 
276-281 286 290-292 295-300-301 
324 331 334 339 346-347 



310 322 

349-350 352 357 371 376-377 382 



infant brain

Columbia
University



IBM002 



infant brain

Columbia
University



IBS001 



384 403 408-409 414-415 453-455
472 476 478-479 490 503 507 516
520 530 534 536-540 551 563 572-
576 585 587 590-591 593 595-596
601 606 612 616-617 620 622-624 
650 652-653 661 665 670-671 674- 
675 678 C09 71S 717 727-728 730 
734 759 775-777 780-781 785 796 
806-807 811 824 845-846 864 869 
875 882 889 894-895 898 904 917
919 921-923 932 935-936 946 950 
954 962 977 979 997 999-1000 
1005-1006 1009 1011 1017 1024
1033 1037 1043 1055 1057 1109
1114-1115 1120 1123 1127 1144-
1145 1149 1151-1153 1160 1167
1170 1174 1193-1194 1196 1199
1202 1206 1209 1220-1221 1226
1229 1240-1241 1251 1258 1284
1288-1289 1305 1314 1327 1333
1344 1347 1350 1356-1357 1365-
1366 1378-1379 1388 1400 1403
1421 1423 1431 1436 1440-1441
1446-1447 1457 1459 1471 1499
1503 1507 1509 1535 1546 1557-
1559 1567 1572 1587 1595 1598
1610-1612 1615 1631 1639 1644
1647 1657-1658 1673 1678-1681
1683-1684 1701-1702 1708-1709
1713-1714 1719 1757 1760-1761
1765 1771 1778 

101 113 139 152 2tf0 279 290-292 
374 377 551 563 608-609 653 659 
814 954 1005-1006 1029-1030 1130 
1164 1209 1258 1294 1305 1320 

1327 1397 1431 1498 1507 1615
1640 1694-1695 1763-1764 1767

1779

10 12 119 175 279-281 321 334 

371 446 551 563 623 652 667 669


























412 819 949 9*6 1113 1130 1151 1188
1193-1194 1196 1229 1258 1265 1271
1287 1317-1319 1324-1325 1342 1423
1440-1441 1448 1471 1482 1525 1532
1546 1562 1569 1588 1591 1610 1618
1647 1649 1658












lung, 


oL-iaLcy cue 




5-9 17 20-21 25 68-69 82 94 105 153
157 197-198 203 207-208 212-213 223
262 266 283 302 321 326 333 356 370
427 430 436 446 462 472 493 498 503
516 519 527 535 537-540 542-544 562
565 567 586 600 607 615 630 647
662-664 692-694 712 719 745 748
775-777








/ j»i — 


796 


610 


837 


843-847


849 


854- 










869 


876 


903 


934 


953 


955- 


956 








yes 


975-976 


984 


1000 1005-1007 










-1025 1033 


1039 


1053 


1064 








1070 


1072 1082 


1112- 


-1113 


1134 








1136 


-1138 1140 


1195 


1223 


1232- 








1233 


1246 1279 


1285 


1295 


1311 








1320 


1334-1335 


1343 


1427 


-1428 








1446 


1478 1482 


1493 


1504 


1537 








1552 


1555 1567 


1575 


1582 


1598 








1620 


1625 1632 


1638 


1645 


1654- 








1655 


1662 1680-1681 


1684


1686 








1690 


1696 1702 


1711 


1733 


1741 








1760 


-1761 1778 


1785








lung tumor

Invitrogen 


LGT002 


5-10 


18 


20-21 29 33-36 40 43 52 




54-55 61 65-66 


68-70 73-


75 80 85 








88-89 93-94


100 


103 


106- 


108 


112- 








113 


115-116 


118 


-119 


123- 


124 


126 








130- 


132 


135- 


-137 


139 


•141 


143-144 








147- 


148 


151- 


-153 


155- 


•156 


159 


161 








164 


169 


171 


179 


-180 


185 


190 


192 








194 


196- 


-199 


203 


-208 


210 


212- 


•214 








216- 


217 


219 


222 


233 


240- 


241 


244 








246 


251 


-252 


255 


-256 


261- 


262 


266. 








272 


276-277 


279 


-281 


284 


286 


288 








,290 


295 


298 


301 


-302 


309- 


312, 


317 








321 


329 


332 


341 


-342 


344- 


345 


348








352 


358 


-360 


363 


368 


370- 


371 


376 








380- 


381 


384


389 


-390 


398 


400 


409 








414 


423 


426-427 


430 


432- 


436 


443- 








444 


450-451 


454 


462 


468 


472- 


■477 








480- 


483 


487- 


-488 


490-491 


493 


496- 








498


500 


503- 


-506 


509 


-512 


515- 


•516 








519 


521 


-523 


526 


530 


534 


541 


544 








547 


554 


557 


564 


566-567 


572- 


•576 








585- 


586 


588- 


-589 


595- 


•596 


601 


607 








611- 


612 


615 


619 


621 


623 


626 


630 








632- 


633 


644 


647 


649 


651 


655- 


-656 








660 


662 


-665 


667 


669 


672 


683- 


-684 








696 


700 


706 


710 


713 


716 


718-719 








722- 


723 


728 


734


-739 


743 


750


752 








763 


765-766 


773 


-778 


784- 


785 


787- 








789 


791 


800 


802


-803


809- 


812 


814 








824 


826


828-829


832 


838- 


839 


841- 








845 


849-850 


852 


-855 


857- 


861 


864 








866 


874 


878-880 


882 


887


890-891 








897- 


898 


902 


904 


905-


-907 


910 


916 








918- 


920 


922 


924 


-925 


927 


930- 


•932 








934- 


935 


937 


947 


950 


953 


955- 


-956 








961 


963 


966- 


-967 


969 


971 


977- 


-979 








981 


984 


986 


-987 


990 


992- 


993 


995








997 


999 


-1001 1005-1007 1009 










1012-1013 1018 


1020 


1022 


-1024 








1026 1029-1030 


1033 


1038 1041








1045


1047 


-1050 


1052 


1054 


-1055 










1063 


-1064 


1067 


-1071 


1073- 








1074 


1078 


1085 


1087 


1089 


1095- 








1097 


1104 


1106-1107 


1109 


1112 








1116 


-1117 


1119 


1126 


1134 


-1135 








1139 


1141 


-1142 


1144 


-1145 


1148 








1152 


-1153 


1156- 


-1158 


1157 


1170 








1172 


1178 


1195-1196 


1198 


-1200 








1202 


1204 


1206 


1214 


1216 


1219 








1222 


1227 


1234 


1241 


1247 










1257 


-1258 


1265 


1267 


_ 1270 


±£ /O 








1278 


1280 


-1281 


1283 




±60 o - 








1289 


1295 


1300 


1305


1308 










1317 


-1321 


1329 


1338 


- 13(13 


J.J11 








1344 


-1346 


1349-1351 




-IJ55 








1357 


1365 


-1366 


1369 


1j / p 


-1379 








•L J Dj 


-1385 


1394 


1397 


1400 


1402 - 








1403


1408 


1417 


1419 


1423 


-1426 








1431 


1433 


-1436 


1438 


1444 


1446- 








1 A AO 


1454 


-1455 


1460 


1466 


1468 








1470 


1474 


1480- 


•1481 


1483 


1486- 








1488 


1490 


-1491 


1494 


-1496 


1506 








1508 


-1509 


1511- 


1512 


1515 


-1516 








1519 


1523-1524 


1528 


-1529 


1536- 








1540 


1546 


1549-1550 


1555 


1560- 








1561 


1565 


1567 


1569 


1575 


1588 








1591 


1593-1594 


1596 


-1598 


1600- 








1602 


1608 


1614- 


1616 


1618 


1620 








1624 


-1625 


1627- 


1632 


1636 


1639 








1644 


-1645 


1647- 


1649 


1652-1653 








1656- 


-1662 


1664 


1666-1667 


1670- 








1671 


1673-1675 


1678-1679 


1683 








1685-1688 


1690- 


1692 


1696-1699 








1705 


1709 


1716- 


1717 


1722 


1727 








1730 


1735 


1739 


1741 


1743-1744 








1748-1749 


1753 


1760-1762 


1765 








1767 


1770-1771 


1773 


1775-1776 








1778- 


-1779 


1786 








lymphocytes 


ATCC 


LPC001 


4 11- 


-12 18 24-25 30- 


-31 48 50-51 








56-57 68-69 80 


92 98 103 


105 110 








126 137 152-153 


157 


165 172 188- 








189 197 203 210


217-


-218 222-223








225-226 229 231 


247 


251 256 264 








272 280-281 284 


300- 


-301 321 325-








326 339 348 352 


357 


371 382 384 








390 400 404 412 


414 


421 423 426- 








427 430-431 445 


447- 


448 451 454- 








455 4 


75 503 516 


526-


527 530 537- 








540 549 556-560 


563 


574 577 589








602 613 615-617 


621 


623 628-630 








636-637 647 649 


657-659 690 697 








717 723 755 764 


775-777 780 786 








789-790 793 800 


802 


822 838 849 








866 869 876 881 


-883 


892 898 906- 








907 911 921-923 


928 


975 990 992 








996 1001 1004-1007 1033 1050 








1054 


1078 


1107 


1135 


1140- 


1141 








1143 


1148 


1158 


1163 


1177 


1199 








1205 


1216 


1226 


1231 


1236 


1241 








1244 


1250 


1258 


1260 


1265 


1269- 








1271 


1290- 


1293 


1308 


1312 


1317 








1319- 


1320 


1339 


1345- 


1346 


1348 








1350- 


1351 


1357 


1367 


1369 


1379 








1381 


1383- 


1384 


1386- 


1387 


1389 








1394 


1397 


1405


1423 


1425- 


1428 








1431 


1437 


1446 


1448 


1461 


1466 








1470 


1472 


1474 


1482 


1492 


1506 








1528 


1537 


1546 


1549 


1591 


1598 








1600 


1603- 


1604 


1606


1627


1636



1638 1647-1649 1651 1658-1659
1664 1676-1677 1680-1681 1687-
1688 1699 1711 1715-1716 1726
1728 1737 1740 1746 1748 1752 
1756 1758 1777 1779 



leukocyte 



GIBCO



LUC001



3-4 10-11 13 15-18 20-21 24-25
30-31 35-36 40 43-45 48 50-51 
54-58 60-63 68-69 75 79-80 82-83 
85 88-91 93-96 98 100 103-104 
107-108 112 116 119 123 125-128 
134-140 142 147-149 151 153 155 
157 162-163 167 169-172 174 177- 
179 186 190 192-199 203-207 210
212-215 217-219 222-223 229 235- 
236 247 251 255-258 260 262 272 
274-277 280-281 285-286 297-301
307-310 313-314 316-317 321 325- 
330 333-334 340-342 348-349 352 
354-358 370-371 380-385 387-388 
400 405 408-410 412 414-416 421- 
425 430-431 434-435 437 439 441- 
442 445-451 453-454 456 459 461-
464 468-472 474-479 481 483-485 
487-491 496 499-501 503-504 509- 
513 516-519 522 526-527 529-531 
534 536-540 542 547-549 553-559 
566-567 571 574-577 579 582 584- 
586 589 593 595-597 601-602 604 
606-607 611-613 615-621 623 627- 
629 633 636-637 642 644-650 655 
659-660 662-665 667 669 674-675 
678 682-684 692-696 698 700 706 
708 710 716-720 725-726 729-736
738-739 743-745 749 751 753 756 
759 765-766 768 770-775 780 784- 
786 788-790 793 796 798 800 802-
803 810-811 814 817 819 826 828- 
830 832 834-836 838 843 845-860
863-864 866-871 877-879 881-892 
894-896 898 902 904-914 916 919- 
925 927 930-932 935-936 941-942 
945 948-949 953 955-956 958 960-
962 964 967 970-971 973 975 977 
985-990 992-993 995-996 999-1002
1004-1009 1011 1014 1017-1019 
1022-1023 1025 1027 1029-1031 
1033-1036 1038 1041 1043 1047 
1050 1053-1054 1058-1059 1061- 
1062 1064 1068 1070 1072 1078 
1085-1086 1089-1091 1093 1097 
1106-1107 1110-1113 1115-1117 
1122-1123 1125 1129 1132-1133 
1135-1137 1140-1145 1152 1158 
1163 1168 1170-1174 1176-1178 
1180 1182-1183 1186 1195 1198- 
1200 1202 1205-1206 1211 1216 
1219-1221 1223-1227 1230-1236 
1238-1242 1247 1252 1254 1256 
1258 1261-1262 1264-1265 1269- 
1270 1272-1275 1277 1280-1284 
1287-1293 1299-1300 1306 1308 
1312-1313 1317-1320 1322 1324- 
1330 1333-1335 1339 1341 1343- 
1347 1349 1353-1357 1359-1361 
1365-1367 1369-1370 1373-1374 
1377 1379-1381 1386-1387 1394 
1400 1403 1409 1419 1423 1425- 
1428 1430-1431 1433-1434 1437- 
1438 1440-1442 1446-1448 1450



1464 
1478 
1501 
1516 
1527- 
1545- 
•1556 
1589 
•1602 
-1621 
1636 
1648- 
1662 
•1688 
1707- 
1723 
1741 
1755 
-1772 



1453 
1470- 
1488 
1506 
1521- 
1531 
1549 
1565 
1594 
1608 
1626 
1639 
1653 
1670 
1692 
1711 
1727 
1744 
1762 
1784 



14 5B- 
•1471 
1490- 
1509 
1522 
1534 
1550 
1567 
1596 
1611 
■1629 
1641 
•1655 
1675 
1696 
1716 
1733 
1748 
1765 
1786 



1459 

1474 

1493 

1512- 

1524- 

1538 

1553 

1575 

1598 

1614 

163?.- 

1644- 

1658- 

1679 

1700 

1717 

1737- 

1749 

1769 



1463- 
1477- 
1496- 
1513 
1525 
1541 
1555- 
1580 
1600- 
1620- 
1632 
1645 
1660 
1684 
1702 
1720 
1738 
17S2 
1771 



1468 

1482- 

1504 

1519 

1528 

1547 

1560. 

1591 

1606- 

1624 

1638- 

1650 

1659- 

1690- 

1709 

1725- 

1743- 

1760- 

1781- 



4~T5-36 44-43 Si $8-69 75 82 102 
119 139 154 179 197 244 280-281 
324 372 404 430-431 455 461 476- 
477 481 503 537-540 554 575-576 
581 589 608-609 621-622 624 630 
632 647 662-664 669 679 698 764 
773 775-777 802 848 851 856-857 
879 905-907 915 949 952 990 992 
1002 1113 1119 1170 1183 1216 
1236-1237 1241 1275 1346 1353 
1357 1359 1377 1506 1515 1534 
1553 1591 1600 1613-1614 1621
1628 1670 1676-1677 1691-1692 

1699 1733 1738 1772 

25 35-36 43 80 104 126 128 150
163 166 188-189 197 210 215 220 
271 277 280-281 310 317 336-338 
345 351 372 380-381 383 387 412 
415-416 430 445 448 454 456 467 
481 490 499 503 526 528 546 548 
567 575-576 588 601 613 615 647
660 665 734-735 737 759 778 787
790 800 832 845 856 859 869 878
883 887 905 914 932 934 958 976
985 990 992 999-1000 1025 1031 
1038 1050 1055 1068 1074 1088 
1099-1102 1107 1136-1138 1149 
1156 1163 1172 1190 1195 1200 
1214-1215 1217 1226-1227 1235 
1238-1239 1244 1253 1278*1230 
1293 1311 1320 1330 1334-1335 
1345 1355 1367 1386-1387 1394 
1403 1406 1414 1423 1437 1442 
1465 1521 1529 1536 1539 1541 
1547-1548 1582 1620 1626 1631 
1638 1647 1653 1660 1667 1669- 
1670 1680-1681 1696 1704 1715 
1724-1725 1731-1732 1750 1760- 
1761 



leukocyte 



Clontech 



LUC003 



melanoma from 
cell line ATCC 
#CRL 1424 



Clontech 



MBL004 



mammary gland 



Invitrogen 



MMG001 



5-8 10 12 14-18 20-21 24 25 29
33-39 42-43 52 55-58 60 64 68-69
71 73-74 79 80 82 89 98 100 103
106 108 112 123 128 133 137 144-
146 148 150 152 154 158-159 165-
166 170-172 174 176 178 181-185
188-190 194 198 201-206 210 217-
222 224 227-228 231 233-237 247
251 253-254 256 261-263 266-267
271 276-277 279-281 284-286 288




















290 2 


97 299 301 


304 


309-3 


12 318








320-3 


21 323-325 


327- 


329 331-332 








334 339 34 


1 344-345 


348 350 356 








359-360 362-363 


368 


371 376 379- 








303 380 390 393 


-395 


397-398 405 








408 412 414-415 


423


430 434-437 








441-444 44 


8 451 


-455 


462-464 474 








476 479 482 485 


-486 


488 490 494- 








495 498 503 506 


509- 


512 516-517 








519-520 522 527 


529 


534 537-541 








.547 5 


49 554 557 


562 


572-574 587 








589-591 597 602 


607 


618 623 626- 








629 632 634-640 


644 


647-648 650- 








652 655 657-658 


660 


665 6 


67 669- 








672 6 


74-676 679 


682 


688 695-696 








706-707 710 713 


717 


720 722-730 








732-734 736 738 


743 


747-748 750 








755 7 


59 761 766 


770 


780 7 


84 786-








789 794 803 806


-807 


809 814 817- 








822 827-829 837


842 


854-858 863- 








-864 866 869-870


872 


878 881 889 








893-900 904 906 


-907 


911 916 919 








921-923 926 935 


-937 


946 948-949 








953-954 957 960 


-961 


963 965-966 








970 977-978 984 


-989 


993-997 








1000- 


1001 


1005- 


1006


1008 


1013- 








1014 


1016- 


1017 


1023 


1025 


1027 








1032- 


1033 


1036 


1039 


1043 


1045 








1055


1057- 


1058 


1063 


1068- 


■1075 








1077- 


1078 


1085 


1087 


1089-1091 








1095- 


1102 


1107- 


1108 


1112- 


•1119 








1121- 


1123 


1131- 


1133 


1136-1137 








1139- 


1142 


1144- 


1145 


1148- 


-1149 








1153 


1159 


1167 


1170 


1172-1173 








1183-1185 


1190- 


1192 


1196-1199 








1207-1208


1212 


1216-1218 


1222- 








1223 


1225 


1231 


1234 


1240-1241 








1247 


1253-1254 


1258-1259 


1261- 








1262 


1270- 


-1280 


1283 


1285- 


-1286 








1298 


1307 


1314 


1316- 


•1320 


1323- 








1325 


1330 


1334- 


1335 


1342-


-1345 








1349- 


•1352 


1354- 


1355 


1359 


1369- 








1370 


1377 


1379 


1381 


1383-1384 








1389 


1405 


1414 


1419 


1421- 


-1423 








1425- 


-1426 


1428- 


1429 


1431 


1434- 








1437 


1439 


1448- 


1449 


1454 


1457 








1460-1464 


1466 


1471 


1480 


-1483 








1487 


1489-1491 


1493


1505 


1507 








1512 


1519 


1526- 


1526 


1532 


1534 








1536 


1539 


1542 


1547 


1549 


-1550 








1554 


1561- 


-1562 


1564 


1567 


1572 








1576-1579


1581- 


1582 


1587 


-1588 








1592 


1594 


1596- 


1597 


1601 


-1602 








1607 


-1608 


1610 


1612 


-1616 


1618 








1621-1622 


1625- 


1626 


1631 


1635- 








1636 


1641 


1643- 


1644 


1647 


1650 








1652 


1654-1655 


1657 


-1658 


1660 








1662 


1664 


-1666 


1669 


-1671 


1673-








1674 


1676 


-1677 


1680 


-1685 


1689- 








1692 


1701 


1706 


1713 


-1715 


1719- 








1720 


1723 


-1728 


1730 


-1732 


1738 








1740 


1742 


-1744 


1746 


-1747 


1749 








1751 


1753 


1760- 


-1762 


1765 


-1768 








1771 


1774 


1776-1777 


1779 


1783- 








1784 


1786 










induced neuron 


Stratagene


NTD001 


29 35-36 


80 116 123 


156 


163 181 


cells 




214 


230 280-281 284 


-285 


307 321 






330 


340 358 371 375 


377 


380 382 








422 424 4 


92 497 532 


-533


542 546



retinoic acid induced neuronal cells



Stratagene



NTR001



549 566 595 €12 £4*-647 654" 

734 775-778 780 732 799 821 826 
856 858 875 936 953 985 990 992 
1041-1043 1055 1072 1104 1193- 
1194 1206 1223 1246 1253 1274 
1288-1289 1291 1294 1311 1320 
1349 1359 1412 1423 1485 1620 
1623 1645 1684 1705 1715 1751
5-8 78 268-269 27? 383 431 506
623 677 731 999-1000 1199 1425- 
1426 1547 



Stratagene



NTU001 



pituitary- 
gland 



Clontech



PIT004 



29 65-66 80 82 110 119 146 157
166 174 181-185 198 227-228 253 
284 309 325 332 334 336-338 375 
391 393 406 414-416 454 465-466 
470 488 503 506 510-512 519 537-
540 572-574 597 602 607 623 647
661 700 702 716 743 771 792 858 
904 948 954 977 1000 1005-1006 
1025 1064 1068 1122 1146 1185
1219 1226 1234 1246 1271 1283
1295-1296 1311 1317-1320 1329- 
1330 1350 1355 1365-1366 1378 
1383-1384 1400 1412 1445 1505 
1539 1547 1578 1647 1656 1683 
1690 1738 1749 1783-1784 



311 314 379 408 419 430 454 105* " 
1095-1096 1272-1273 1312 1320 
1378 1652 1671 1720 1725 1736 
1741 1755 



Clontech



PLA003 



5-8 124 208 277 370 843 906-907 
1280 1317-1319 1359 1609 1621 
1737 



PRT001 



9 46 57 71 107 147 171 177 197 

201 229 231 242-243 274 280-281 
307 310 317 330 358 373 382-383 
400 430 434-436 461-462 469 477 
489 497 500 505-506 513 521 526 
531-533 547 618 649 657-658 662- 
664 710 729 767 771 789 820 861 
871 874 890-891 905 938 945 963-
964 988-989 1002 1025 1033 1045
1061 1095-1096 1112 1125 1142 
1196 1198 1202 1232-1233 1241 
1258 1272-1273 1287 1295 1313 
1333 1341 1344 1349 1360 1362-
1363 1367 1437 1442 1447 1475
1478-1479 1482 1489 1513 1517
1527 1531 1536 1598-1599 1628



1636 1657 1680-1681 1687-1688 
1717 1738 1743-1744 



Invitrogen



REC001 



17-18 29 33 62-63 71 73-74 83 86 
113 126 146 153 158 167-169 195 
200 206 261 309 312 341 344 368 
373 388 395 408 414 420 430 441- 
442 446 448 464 468 483 517 537- 
540 547 567 585 589 602 623 628- 
629 632 645-647 651 657-658 669 
717-719 721 725-726 730 748 750 
756 762-763 766 770 774 790 819 
825 843 849 851 881 903 909 948- 
949 960 986 996 1020 1023 1033- 
1034 1064 1067 1070 1075 1086 
1108-1109 1113 1130 1139 1153 
1159 1172 1178 1185 1187-1189 
1205 1220 1225 1240 1244 1271 
1317-1320 1323 1334-1335 1350-
1351 1355 1369 1373 1375 1425-










1 > *> ^ 1/1^ n/lOO i j rn till 1 x 

1436 1439 1459 1474 1477








14B2 lo4b IbUV-lboo 1532 1396 








J.01U Xb^^ ib£ / lb44 lbDO JLoq< 








ibos-lobo 1669 1675-1677 1749 








l / a o 


Hi "4 J 

salivary gland


Clontech 


SAL001 


10 55 97 103 110 140 149 152 158








198 217-218 242-243 256 301 308








312 321 333 ibl 354 360 410 437 








448 473 487 494 496 501 535 555 








569-570 572-573 590-591 624 636 








651 759 762 764 768 771 788 800 








OU7 D«D 04 0 ODD O f 7 7W0 ?U ( 923 








533 9G3 "LOlG ifl?c "l f\A C 








1055 1066 1103 1150 1172 1181 
















1315 1320 1333 1336-1337 1346 








1 3<S 0 137*1 117Q ">d">4 1 4 d Q 








1474 1482 1492 1494 1498 1511 
















1627 163b looZ-lbDD 1658 1665 








lb/l-lb/2 lbyi-lfayz 


salivary gland 


Clontech


SALS03 


158 326 1423 1463-1454 


skin 


ATCC 


SFB001 


1320 1400 


fibroblast 








skin 


ATCC 


SFB002 


262 736 1025 1253 


fibroblast 








skin 


ATCC 


SFB003 


709 1119 1350 1631 1653 


fibroblast 








small 


Clontech 


SIN001 


25 142 146-147 151 155 198 203 


intestine 






244 260 271 280-281 286 288 298 








301-302 308 312 334 340 371 398 








408 412 414 416 423 425-427 430 








434-435 445 452 454 478 503 516








519 521 523 543 547 549 555 559 








563 569-570 585 592 604 611 626 








628-629 632 650 659 681 710 714 








718 750 764 780 798 829 842 857 








859 866 887 892 894-895 901 904








906-907 912 919 935 997-998 1000 








1007-1008 1026-1028 1044 1055 






r 


1089 1097 1116-1117 1131 1148








1169 1199 1219 1234 1247 1264 








1279 1316 1320 1326 1341 1343 








1349 1351 1374 1387 1398 1400 




• 




1403 1407 1423 1428 1468 1498 








1501 1521 1550 1556 1585 1597 








1636 1638-1639 1645 1653 1656 








1662 1671 1675 1684 1691-1692








1704 1711 1717 1719 1722 1725- 








1726 1729 1733-1734 1743-1744 








1762 1767 1780 1785 




Clontech




id 2 If - 21 OZ o*k lul lib 134 14 a 


muscle 






151 153 166 225-226 258 274 277 








*>flQ lift 3d A1 O 6.1 A. AAD AGO < 
zoj jz; joi <ti« mi* l *x^t * ** u 








>tCa A1f\ A D ft Crtl Cf\A O"? CAt\ CA*1 








660 673-675 715 773 780 786 830 








one oo*5 ocn oe*4 aoo son ao*> i non 








1047 1063 1115-1117 1121 1134 








1228 1268 1284 1298 1321 1329 








1336-1337 1343 1409 1413-1414 








1509 1599 1624 1644 1653 1712


skeletal 


Clontech 


SKM002 


168 1683 1712 


muscle 








skeletal 


Clontech 


SKMs03 


235-236 1409 


muscle 








skeletal 


Clontech 


SKM804 


235-236 


muscle 








spinal cord 


Clontech 


SPCtiOl 


4 9 11 17 30-31 35-36 43 46 60



adult spleen 



Clontech 



SPLeOl 



82 85 92 94 108 110 116 139 157
167 198 204-205 210 215 229 256
259 277 280-281 300-302 304 315
317 372 379 387 392 419 426-427
430 433 448 467 473 487 489 506
509 513 519 524 526 537-540 543
547 549 551 559 567 569-570 593
607 616-617 623 625 637 649-650
652 657-658 670-671 673 679 681-
682 709 711 715 719 728-729 734
749-750 753 775-777 782 789 792
809 820 832 834-836 847-849 854-
855 858 861 864 871-872 875 884
898 906-908 917 919 924 934 942
944 970 985 990 992-993 998 1013
1039 1053 1059 1065 1072 1075
1077 1082 1085 1097 1103 1109
1116-1117 1128 1134 1151 1170
1174 1192-1194 1215 1225 1241
1243 1283 1294 1307 1312 1320
1323 1327 1330 1350 1353-1354
1356 1359 1368 1375 1400 1406-
1407 1423 1429 1437 1443 1448
1454 1470 1482 1492 1501 1508
1511 1529 1538 1548-1549 1565
1571 1578 1598 1600 1614 1625
1627 1630 1639 1646 1651-1652
1670 1686 1695 1740 1751 1755
1771



117 312 326 348 424 426-427 431 
845 866 1320 1330 1333 1344 
1355-1357 1371 1387 1397 1446 
1538 1579 1669 1686 1739 1767 



Clontech 



STO001 



10 15-16 61 68-69 100 127 149 
197 201 227-228 231 249 273 280- 
281 287 291-292 302 312 358 362 
426-427 430 446 462 475 479 535 
597 620 630 651 662-664 722 739 
780 782 785 846 919 960 964 966- 
967 976 1008 1012 1032 1042 1063 
1071 1135 1170 1208 1234-1235 
1259 1277 1280-1281 1322 1349
1359 1369 1449 1468 1474 1478 
1487 1493 1498 1557-1559 1622 
1634 1651 1653 1729 



Clontech 



THA002 



thymus 



9 11 25 85 87 112 137 14* 180 
190 198 206 210 212-213 235-236 
239 261 268-269 279 290 301 325 
333-334 341 351 356 364-365 379 
388 393 396 419-420 441-442 458
477 483 508 525 531 549 567 606 
608-609 647 681 715 725-727 736
774 782 784 794 827 883 890-891 
899-900 961 997 999-1001 1004 
1034 1055 1097 1129 1144-1145 
1150-1151 1157 1172-1173 1177 
1193-1194 1208 1220 1249 1280 
1305 1345 1355 1369 1434-1435
1440-1441 1454 1496 1546 1549 
1562 1572 1578 1590 1594 1613- 
1614 1640 1651-1652 1671 1687- 
1688 1703 1743-1744 1746-1747 
1753



Clontech 



THM001



44-45 54 57-58 62-64 79 104 123
126 134 153 193 212-213 218 242-
243 258 274 277 279 297 301 307
327 330 333 342 351 358 371 410
430 445 465-466 468 471 483 487
493 503 506 509 517 526 535 537-



thymus 



Clontech 



THMC02 



540 546 548 554 567 584 586 590-
591 604 612 621 638-640 645-647 
649 656 660 665 670 698 710 720 
728 735 739 746 759 762 766-767 
775-777 780 784-785 800 802 809 
824 826 828 845 851 858-859 864 
866 870-871 878 884 887 892 899- 
900 927 930-931 967 983 986 990 
992 999 1014 1029-1030 1033 1059 
1066 1073 1103 1107 1113 1116- 
1117 1119 1140-1142 1158 1163 
1172 1177 1195 1206 1209 1213 
1216 1218-1219 1221-1222 1227 
1271 1277 1282 1320 1329 1349 
1367 1369 1383-1384 1417 1419 
1423 1425-1427 1448 1477 1488 
1493 1536 1554 1620 1644 1646
1649 1654-1655 1661-1662 1669-
1670 1674 1676-1677 1685-1688 
1707 1711 1731-1732 1737 



5-9 15-21 25 33 35-36 43-45 48

50-51 54-55 60 75 83 87 89 93
98-100 102 105 112 117 135-137 
141 143 146 157 167 169 192 196 
211 217-219 222 224 229 233 235- 
236 240-241 244 251-252 256 261- 
262 268-269 286 288 290 295 297 
301-302 309-310 315-317 321 324 
327 334 342 350 352-353 360 370- 
373 382 384 400 403 410 414-416 
424 430-431 436 445 454-456 461
464-467 470 472 474-476 483 488 
497 500 504 506 513 516 519-520 
524 526 530-531 534 537-540 549 
554-555 565-566 569-570 572-573 
575-577 586-587 595 603-604 606 
612 630-632 634 636 647 650 657- 
660 666-667 669 673-675 678 698 
700 703 708 720 725-726 731 738- 
739 743-744 750-753 757 759 763- 
765 767 772-779 787 789-790 798 
800 810 823 829 834-836 841 848 
854-856 859 861 864 870-871 881 
890-891 898 908-909 913 928 933 
941 949 958 961 963 967 969 975 
981 986 988-990 992 999 1007- 
1008 1014 1016 1039 1041 1073- 
1074 1079 1089 1097 1109 1114- 
1117 1122 1131 1140-1141 1144- 
1145 1163 1172 1175-1177 1186 
1196 1198 1206 1211 1216 1220 
1223 1227 1234-1243 1261-1262 
1267 1271 1280-1281 1284 1290 
1308 1317-1320 1322 1324-1325 
1327 1330 1334-1335 1339 1346 
1350-1351 1355 1357 1360 1370 
1374 1377-1379 1386 1389-1390 
1392 1397 1400 1402 1406-1407 
1417 1423 1425-1427 1440-1441 
1466 1474 1477 1483 1493 1498 
1504 1506 1525 1536 1545 1549
1566 1594 1598-1600 1608 1611 
1614 1621 1623 1625 1632 1639 
1641 1644 1647 1649 1653-1656 
1658 1662-1663 1671 1673 1678- 
1681 1686-1688 1693 1705 1707 
1711 1717-1718 1726-1727 1731- 
1733 1737-1738 1743-1745 1758- 
1761 1771-1772 1779 1786



thyroid gland



Clontech



THR001



trachea 



Clontech 



TRC001 



4 9-10 20-21 37-39 48 50-51 54- 
57 60-61 65-66 71 83 94-96 98- 
100 102 104 110 112 115-117 119 
123 127 133 136-137 140 149 152- 
153 155 158 163-164 168-169 171
186 190-192 197 201-203 219-220 
229 233-237 246-247 253 256 258 
262 265-266 268-269 277 280-281
284-286 288-289 298-299 302 309- 
311 317 321 326 332 335 341-342 
344 348 350 354 358-359 363 368 
371-373 382-383 385 394 398 400- 
401 411 414-415 421 424 430-431 
433-436 443-446 450-452 454-455 
458 472-474 476-478 482 484-485 
487-488 490-494 496-497 500-501 
503-504 506 509-513 516-517 519 
524 526-527 529 535-540 547 549 
562 564 569-570 575-576 588 594- 
595 601-602 604 606 610 612 615- 
617 619-623 628-630 634-635 642 
647 649-651 660 662-665 668 670 
681 690-694 696 698 700 709 721 
727-729 732 734 738 740-741 743 
745 750 759 761 763 765 770 773 
780 785 795-796 798 802 804 823-
824 826 828 833 838 841-845 847 
849 857-860 867 874-875 878 880-
881 887-888 890-892 894-895 898 
908 910-911 913-914 922-923 926- 
927 929 932-934 937 939 941-942 
948 953 957 961 963-964 966 978-
979 981-982 987 990 992 1001 
1004-1006 1010 1014 1020 1024 
1033 1038-1039 1044 1047 1050 
1052-1054 1056 1058 1068 1070- 
1071 1077-1079 1088 1094-1097 
1105-1106 1112-1113 1116-1117 
1124 1126 1128-1129 1131 1134 
1136-1137 1142-1143 1146-1147 
1149-1150 1156 1161-1164 1167 
1170-1173 1177-1181 1190 1192* 
1197 1200 1204 1208-1209 1214 
1217 1219 1222 1230 1232-1233 
1235 1241 1245 1247 1254 1257- 
1258 1260 1262 1271-1273 1283 
1286-1289 1299 1306 1314 1320 
1330-1332 1334-1335 1342 1345 
1349 1365-1367 1370-1372 1374 
1381 1394 1407 1419 1428 1436-
1437 1440-1441 1443 1446-1449 
1454 1459 1461-1462 1468 1470- 
1471 1475 1477 1479 1482 1491 
1497-1498 1504-1505 1507 1513 
1522 1524-1526 1528 1531 1534 
1536-1537 1548 1550 1553 1555-
1559 1562 1567 1578 1590-1591 
1597 1599-1601 1612 1614 1616 
1619-1620 1622 1624-1626 1628 
1631-1632 1634 1636 1639 1644- 
1645 1648 1651 1653-1656 1658 
1660 1662-1663 1667 1669 1671 
1675 1678-1681 1683-1686 1689 
1691-1692 1703 1709-1711 1717 
1724-1726 1729 1734 1737-1738 
1740 1743-1744 1749 1753 1759- 
1761 1770 1777 1786 



9 29-31 46 48 87 104 107 110 135
158 222 262 266 286 301 318 331



Clontech 



352 372 377 384 414 424 44^-446
454 472 474 491 496 560 579 588
593 597 607 612 626 681 702 719
810 859 866 878 894-895 912 916
922 932 935 1046 1075 1080 1099- 
1102 1113 1208 1215 1232-1233 
1237 1281 1312 1385 1387 1405 
1414 1424 1430 1437 1447 1505 
1569 1579 1586 1600 1641 1653 
1667 1671 1676-1677 1683 1691- 
1692 1711 1717 1726 1772 



uterus 



UTR001 



17 19 25 41 46 57-58 61 89 104
108 139 152 174 198 200-201 206 
263-265 274 290 387 408 420 438 
446 448 452 473 491 493 499 503 
506 513 519 522 526 530 542-543 
560 601 610 632 659 665 720 751 
773 780 833 845 857 872 877 912 
929 934 937 996 1009 1011 1018 
1050 1075 1107 1124 1170 1219 
1256 1279 1287 1310 1320 1323 
1343-1344 1375 1437 1451-1452 
1478 1481 1498 1519 1521 1536 
1552 1579 1597 1602 1606 1620 
1626-1627 1649 1652 1661 1670 
1719 1722-1723 



TABLE 2

SEQ ID NO | Accession Number | Species | Description | Smith-Waterman Score | Identity


1 


Y41736 


Homo 
sapiens 


Human PR01114 protein 
sequence . 


13 98 


100 


2 


Y66656 


Homo 
sapiens 


Membrane -bound protein 
PR0943 . 


2389 


99 


3 


AF113136 


Homo sapiens 


IL-1 receptor-associated- 
kinase-M; IRAK-M 


3043 


100 


4 


AF017806 


Mus mus cuius 


Zn-15 transcription factor 


6351 


77 


5 


X02761 


Homo sapiens 


fibronectin precursor 


10535 


98 


6 


X02761 


Homo sapiens 


fibronectir. precursor 


B990 


B9 


B 


X02761 


Homo sapiens 


fibronectin precursor 


12564 


99 


9 


" AJ011679 


Homo sapiens 


Rab6 GTPase activating 
protein, GAPCenA 


5251 


99 


10 


WB8501 


Homo sapiens 


Human stomach carcinoma clone 
HPl0415-encoded protein. 


2381 


100 


11 


AP117754 


Homo sapiens 


thyroid hormone receptor- 
associated protein complex 
component TRAP240 


11336 


98 


12 


Z97630 


Homo sapiens 


dJ466H1.4 (novel protein 
similar to ANK3 (ankyxin 3, 
node of Ranvier (ankyrin 
G))) 


896 


100 


"13 


Y58620 


Homo sapiens 


Protein regulating gene 
expression PRGE-13. 


1894 


98 


14 


AF213457 


Homo 
sapiens 


triggering receptor expressed 
on myeloid cells 2 


'123B 


100 


16 


AF2334S3 


Homo sapieno 


RACK- like protein PRKCBP1 


3124 


99 


17 


AF201303 


Homo sapiens 


dhfr oribeta- binding protein 
RIP60 


-3130 


98 


18 


AF064205 


Homo sapiens 


dynactin 1 pi 50 isoform 


6377 


100 


19 


U00059 - 


Saccharomyce 
s cerevisiae 


Yhrl21vp 


174 


26 


20 


AB032903 


Homo sapiens 


guanos ine monophosphate 
reductase isolog 


1801 


99 


21 


AB032903 


Homo sapiens 


guanosine monophosphate 
reductase isolog 


1485 


99 


22 


AF140507 


Homo sapiens 


Ca2 + / ca Imodul in- dependent 
protein kinase kinase beta 


3083- 


"99 


23 


AF140507 


Homo sapiens 


Ca2+/calmodul in-dependent 
protein kinase kinase beta 


2300 


99 


24 


AJ289131 


Homo sapiens 


chondroitin 4-O- 
sulfotransferase 


2211 


99 


25 


U334^0- • 


Homo 
sapiens 


DNA-directed RNA polymerase 
I, largest subunit 


8777 


98 


26 


Y44458 


Homo sapiens 


ACRP30R2 variant protein. 


1387 


100 


27 


U43701 


Homo sapiens 


ribosomal protein L23a 


791 


100 


28 


U02032 


Homo sapiens 


ribosomal protein L23a 


767 


97 


29 


Y41324 


Homo sapiens 


Human secreted protein 
encoded by gene 17 clone 
HNFIY77 . 


1083 


99 


30 


W71749 


Homo sapiens 


Human ubiquitin conjugation 
system protein 2. 


715 


90 


31 


W71749 


Homo sapiens 


Human ubiquitin conjugation 
system protein 2. 


£u 


82 


32 


AF231917 


Homo sapiens 


long-chain 2-hydroxy acid 
oxidase HAOX2 


1811 


100 


33 


Z29481 


Homo sapiens 


3-hydroxyanthranilic acid 
dioxygenase 


1507 


99 


34 


AB001451 






2869 


100 


35 


Y00644 


Homo sapiens 


precursor polypeptide (AA -34 
to 287) 


1667 


99 


36 


Y00644 


Homo sapiens 


precursor polypeptide (AA -34 
to 287) 


1104 


98 


37 


Y78795 


Homo sapiens 


Human antizuai-2 (AZ-2) amino 
acid sequence. 


3586 


78 


38 


Y78795 


Homo sapiens 


Human antizuai-2 (AZ-2) amino 
acid sequence . 


4726 


99 


39 


Y78795 


Homo sapiens 


Human antizuai-2 (AZ-2) amino 
acid sequence . 


3556 


77 


40 


U93121 


Homo sapiens 


M-phase phosphoprotein-1 


3747 


100 


41 


Y42750 


Homo sapiens 


Human calcium binding protein 
1 (CaBP-1) . 


795 


100 


42 


AF282626 


Homo sapiens 


latexin 


1189 


100 


43 


G02150 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6231. 


384 


94 


44 


U19617 


Mus musculus 


Elf-1 


2724 


88 


45 


U19617 


Mus musculus 


Elf-1 


2062 


66 


46 


AF100758 


Homo sapiens 


osteoinductive factor OIF 


1538 


100 


47 


Y87591 


Homo sapiens 


Human SPROUTY-1 protein, SEQ 
ID NO:24. 


1737 


99 


49 


X04145 


Homo sapiens 


T3 gamma precursor (aa -22 to 
160) 


942 


99 


51 


X63547 


Homo sapiens 


oncogene 


5845 


99 


52 


M94043 


Rattus 
norvegicus 


rab-related GTP-binding 
protein 


1089 


96 


53 


L31783 


Mus musculus 


uridine kinase 


917 


71 


54 


X83973 


Homo sapiens 


transcription factor 


4486 


98 


55 


AF224741 


Homo sapiens 


chloride channel protein 7 


4128 


99 




W74805 


Homo sapiens 


Human secreted protein 
encoded by gene 77 clone 
HOEAS24 . 


1491 


100 


57 


ZS0907 


Homo sapiens 


Human TBC-1 cDNA from second 
transcript. 


4824 


100 


58 


D79994 


Homo sapiens 


similar to ankyrin of 
Chromatium vinosum. 


6089 




59 


D79994 


Homo sapiens 


similar to ankyrin of 
Chromatium vinosum. 


4014 


91 


60 


Y59738 


Homo sapiens 


Human normal ovarian tissue 
derived protein 15 . 


601 


100 


61 


AB031069 


Homo sapiens 


protein containing cxxc 
domain 1 


1390 


100 


62 




Homo 
sapiens 


Membrane-bound protein 
PRO783. 


2492 


99 


63 


Y66660 


Homo 
sapiens 


Membrane-bound protein 
PRO783. 


1709 


99 


64 


S70011 


Rattus sp. 


tricarboxylate carrier 


895 


55 


65 


AF139518 


Rattus 
norvegicus 


A-kinase anchor protein 


178 


24 


66 


W29666 


Homo sapiens 


Homo sapiens DH1308 1 clone 
secreted protein. 


157 


30 


67 


AJ245738 


Homo sapiens 


claudin-15 


1206 


100 


68 


AF099138 


Rattus 
norvegicus 


GLUT 4 vesicle protein 


4183 


87 


69 


AF099138 


Rattus 
norvegicus 


GLUT 4 vesicle protein 


4906 


86 


70 


Z82059 


Caenorhabdit 
is elegans 


Similarity to Drosophila ring 
canal protein comes from 
this gene 


1285 


44 


71 


AF224278 


Homo sapiens 


PMEPA1 protein 


1282 


100 


72 


AF126426 


Homo sapiens 


neurotrimin 


1809 


100 


73 


Y41652 


Homo 
sapiens 


Human MEK2 protein sequence. 


2065 


99 


74 


Y41652 


Homo 
sapiens 


Human MEK2 protein sequence. 


1207 


100 


75 


AF188622 


Mus musculus 


selectively expressed in 
embryonic epithelia protein-1 


1485 


74 


76 


AE000406 


Escherichia 
coli 


putative DNA topoisomerase 


950 


100 


77 


X99302 


Homo sapiens 


Popl 


655 


100 


78 


AL136538 


Schizosaccha 

romyces 

pombe 


similarity to S. cerevisiae 
kti12 protein 


210 


31 


79 


AF129756 


Homo sapiens 


G4 


1554 


99 


80 


AL096768 


Homo sapiens 


dJ858B16.2 
(phosphatidylserine 
decarboxylase (PSSC, EC 
4.1.1.65)) 


2033 


100 


OX 




Homo sapiens 


dJ858B16.2 


1220 


96 








(phosphatidylserine 
decarboxylase (PSSC, EC 
4.1.1.65)) 


82 




Homo sapiens 


1-8D 


677 


98 


83 


AC005594 


Homo sapiens 


R26984_1 


2700 


98 


84 


X73113 


Homo sapiens 


Cast MyBP-C 


5959 


99 


85 


AF097330 


Homo sapiens 


H1 chloride channel; p64H1; 
CLIC4 




99 


86 


AB018423 


Mus musculus 






78 


87 


AF272151 


Homo sapiens 


adaptor protein CIKS 


3084 


99 


88 


AF19S329 


Homo 
sapiens 


triggering receptor expressed 
on monocytes 1 


1214 


100 


89 
90 


AB016879 
AJ133721 


Arabidopsis 
thaliana 

Mus musculus 


contains similarity to pre- 
mRNA splicing 
factor~gene_id:MRB17.2 
homeodomain protein 


634 
654 


36 


91 
92 


AJ242864 
A61971 


Mus musculus 
unidentified 


phtf protein 
MCSP 


619 


57 
61 


93 


Y99365 


Homo sapiens 


Human PRO1250 (UNQ633) amino 
acid sequence SEQ ID NO: 86. 


11676 
3890 


99 
100 


94 


Y87231 


Homo sapiens 


Human signal peptide 
containing protein HSPP-8 
SEQ ID NO: 8. 


1031 


100 


95 


AF227741 


Rattus 
norvegicus 


protein kinase WNK1 


2428 


95 


96 


AF227741 


Rattus 
norvegicus 


protein kinase WNK1 


1961 


94 


97 


Y92513 


Homo sapiens 


Human OXRE-10. 


1626 


100 


98 


AL0213S6 j 


Homo sapiens 


CICK0721Q.3 (Kinesin related 
protein) 


3423 


100 


99 


AC005733 


Homo sapiens 


R33083 1 


19^4 


99 


100 


Y95293 


Homo sapiens 


Human GEF containing NEK-like 
kinase substrate sGNK. 


4092 


99 


101 

*» 


AI*118501 


Homo sapiens 


dJiisiN16.l (a novel protein 
(translation of the cDNA r f 
DKFZpS 6 6 AO 94 6 , Em:AL050069) ) 


1509 


100 

•* 


102 


AJ006267 


Homo sapiens 


ClpX-like protein 


3233 


100 


103 
104 


AF100753 
AB015982 


Homo sapiens 
Homo sapiens 


ancient ubiquitous 46 kDa 

protein AUP1 

serine/threonine kinase 


2042 
4718 


96 
100 


105 
106 


AF151074 
M35522 


Homo sapiens 
Canis 
familiaris 


GTP-binding protein (rab7) 


831 
354 


64 
50 


107 


R99800 


Homo sapiens 


NTII-1 nerve protein, 
facilitates regeneration of 
nerve cells. 


2337 


93 


108 




Homo sapiens 


NADH-cytochrome b5 reductase 
isoform 


1290 


93 


109 
110 


AP064729 


Homo sapiens 
Homo sapiens 


F23269 2 

RAN binding protein 16 


3369 


99 


ill 

-L-L J. 


X52425 


Homo sapiens 


interleukin 4 receptor 


3285 
4496 


100 
100 


112 ■ ' 
111 


Y41686 


Homo 
sapiens 


Human PRO274 protein 


2285 


100 


Til 


W15506 


Homo sapiens 


Mitogen activating protein 
kinase ERK1. 


1991 


100 


115 


Y71071 


Homo sapiens 


Human membrane transport 
protein, MTRP-16. 


1190 


99 




AL049548 


Homo sapiens 


dJ398G3.1 (ortholog of rat 
CPG2) 


3497 


99 


116 
117 


AF189817 
W30891 


Mus musculus 
Homo 


evectin-2 

Human cytostatin III protein. 


1124 
715 


90 
99 






sapiens 








118 


AF116618 


Homo sapiens 


PRO1038 


1469 


100 


119 


Y08915 


Homo sapiens 


alpha 4 protein 


1748 


100 


120 


AF098070 


Drosophila 
melanogaster 


Lis1 homolog 


192 


39 


121 


AF052432 


Homo sapiens 


katanin p80 subunit 


181 


37 


122 


Y70743 


Homo sapiens 


PSEQ-1 protein encoded by 
NSEQ gene associated with 
matrix remodelling. 


2637 


98 


123 


AF083246 


Homo sapiens 


HSPC028 


2132 


100 


124 


Y27096 


Homo sapiens 


Human viral receptor protein 
(ACVRP) . 


833 


99 


125 


M63109 


Leishmania 
major 


glycoprotein 96-92 


172 


27 


126 


U75467 


Drosophila 
melanogaster 


Atu 


935 


35 


127 


Z68220 


Caenorhabdit 
is elegans 


Similarity to Human ADP/ATP 
carrier protein 


438 


43 


128 


AF095927 


Rattus 
norvegicus 


protein phosphatase 2C 


1927 


94 


129 


W92958 


Homo sapiens 


Human zsig44 protein. 


4^3 


100 


130 


AF115391 


Lactobacillu 
s sakei 


ribokinase RbsK 


508 


37 


131 


X93498 


Homo sapiens 


21-Glutamic Acid-Rich Protein 


1250 


100 


132 


X93498 


Homo sapiens 


21-Glutamic Acid-Rich Protein 


" 9i"£ 


"TH 


133 


W52811 


Homo sapiens 


Human DBI/ACBP-like protein 
(DBIH). 


705 


97 


134 


Y84444 


Homo sapiens 


Amino acid sequence of a 
human RNA-associated 
protein. 


3230 


100 


135 


N69181 


Homo sapiens 


non-muscle myosin B 


189 


20 


136 


W74882 


Homo sapiens 


Human secreted protein 
encoded by gene 154 clone 
HE6FLB3. 


480 


100 


137 


W78200 


Homo sapiens 


Human secreted protein 
encoded by gene 75 clone 
HHGAU81. 


855 


99 


138 


AL033520 


Homo sapiens 


dJ349A12.1 (similar to 
KIAA0701 protein) 


424 


39 


139 


AF020261 


Santalum 
album 


proline rich protein 


119 


30 


140 


X70394 


Homo sapiens 


zinc finger protein 


1634 


100 


141 


Y06439 


Homo sapiens 


Human protease HUPM-8. 


936 


100 


142 


Z68493 


Caenorhabdit 
is elegans 


predicted using Genefinder 


365 


42 


143 


AB018107 


Arabidopsis 
thaliana 


ADP-ribosylation factor-like 
protein 


596 


65 


144 


AF161483 


Homo sapiens 


HSPC134 


580 


51 


145 


Y84902 


Homo sapiens 


A human proliferation and 
apoptosis related protein. 


480 


100 


146 


AB004906 


Ipomoea 
purpurea 


transposase 


146 


20 


147 


AC007357 


Arabidopsis 
thaliana 


F3F19.18 


647 


31 


148 


W75155 


Homo sapiens 


Human secreted protein 
encoded by gene 41 clone 
HNTME13 . 


1494 


98 


149 


AF056490 


Homo sapiens 


cAMP-specific 
phosphodiesterase 8A 


3710 


99 


150 


Y58171 


Homo 
sapiens 


Human hydrolase homologue 
HHH-7. 


785 


99 


151 


U10397 


Saccharomyce 
s cerevisiae 


Yhr146wp 


515 


53 


152 


X73478 


Homo sapiens 


phosphotyrosyl phosphatase 
activator 


1719 


99 


153 


AL049697 


Homo sapiens 


dJ3B2lio.5.i (novel protein 


2034 


99 








similar to arginyl-tRNA) 








Ac lbi)bU2 


Homo sapiens 


cytochrome b5 reductase b5R.2 


1455 


99 


155 


X94703 


Homo sapiens 


rab28 


1126 


99 


156 


Y25716 


Homo sapiens 


Human secreted protein 
encoded from gene 6 . 


1471 


100 


158 


W77404 


Homo sapiens 


Secreted salivary polypeptide 
zsig32. 


937 


100 


159 


Y17248 


Homo sapiens 


Human protein kinase 
inhibitor-2 (PKI-2). 


383 


100 


160 


J04970 


Homo sapiens 


carboxypeptidase M precursor 


2395 


100 


161 


W54040 


Homo sapiens 


Human interferon-inducible 
protein, HIFI. 


484 


98 


162 


AL022724 


Homo sapiens 


dJ413H6.1.1 (hamster 
Androgen-dependent Expressed 
Protein like putative 
protein) (isoform 1) 


1357 


100 


163 


AF125535 


Homo sapiens 


pp21 homolog 


193 


45 


164 


G03632 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7713. 


463 


97 


165 


AJ250839 


Homo sapiens 


serine/threonine protein 
kinase 


1442 


71 


166 


L09649 


Zymomonas 
mobilis 


zm2 


173 


37 


167 


Y73337 


Homo sapiens 


HTRM clone 1944530 protein 
sequence. 


1204 


100 


168 


W88645 


Homo sapiens 


Secreted protein encoded by 
gene 112 clone HUXFC71 . 


1084 


100 


169 


AF214731 


Homo sapiens 


ATP-dependent RNA helicase 


4402 


100 


170 


AE000871 


Methanobacte 
rium 

thermoautotr 
ophicum 


conserved protein 


166 


27 


171 


Y27684 


Homo sapiens 


Human secreted protein 
encoded by gene No. 118. 


821 


100 


172 


AF226044 


Homo sapiens 


HSNFRK 


2904 


100 


173 


AJ245946 


Homo sapiens 


neuroglobin 


779 


100 


174 


D43949 


Homo sapiens 


This gene is novel . 


3202 


100 


175 


Y07923 


Homo sapiens 


GTP -binding protein 


1205 


100 


176 


W90338 


Homo 
sapiens 


Human DPI homologue protein. 


966 


100 


177 


Y41675 


Homo sapiens 


Human channel-related 
molecule HCRM-3. 


1122 


100 


178 


Y41674 


Homo sapiens 


Human channel-related 
molecule HCRM-2. 


936 


99 


179 


AF220492 


Homo sapiens 


Krueppel-like zinc finger 
protein HZF2 


4100 


99 


180 


X03084 


Homo sapiens 


C1q B-chain precursor 


1240 


100 


181 


U57344 


Mus musculus 


Meis3 


1813 


89 


183 


U57344 


Mus musculus 


Meis3 


1743 


86 


184 


U57344 


Mus musculus 


Meis3 


1070 


86 


185 


AF033120 


Homo sapiens 


p53 regulated PA26-T2 nuclear 
protein 


1389 


58 


186 


AF200357 


Mus musculus 


pantothenate kinase 1 beta 


1605 


82 




W75058 


Homo sapiens 


Human secreted protein 
encoded by gene 2 clone 
HLDBG33 . 


1188 


99 


188 


AJ292529 


Homo sapiens 


suppressor of sterile four 1 


2424 


100 


190 


X54134 


Homo sapiens 


protein-tyrosine phosphatase 


3705 


100 


191 


Y22203 


Homo sapiens 


Human calcium-binding 
phosphoprotein, CBPP-1, 
protein sequence. 


1083 


99 


192 


W63692 


Homo 
sapiens 


Human secreted protein 12. 


1975 


100 


193 


W87772 


Homo sapiens 


Human serum glucocorticoid- 
regulated kinase (H-SGK2) 
polypeptide . 


2605 


99 


194 


AF084259 


Mus musculus 


bromodomain-containing 
protein BP75 


693 


54 


195 


10 0752 


Rattus 
norvegicus 


serine dehydratase (AA 1- 
327) 


994 


61 


196 




Homo sapiens 


Human foetal brain secreted 
protein fhl70_7. 


2596 


100 


197 


AB02o8b9 


Homo sapiens 


hDj9 


1890 


100 


198 


W95633 


Homo sapiens 


Homo sapiens secreted protein 
gene clone hm236_l. 


1614 


100 


199 


Y44277 


Homo 
sapiens 


Human nucleic acid methylase- 
2. 


2096 


99 


200 


AB030039 


Homo sapiens 


hPACPLl 


2258 


100 


201 


X54162 


Homo sapiens 


64 Kd autoantigen 


2918 


99 


202 


G02061 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6142. 


558 


99 


203 


X13865 


Nicotiana 
tabacum 


extensin (AA 1-620) 


185 


33 


204 


J04204 


Bos taurus 


32 kd accessory protein 


1837 


100 


205 


J04204 


Bos taurus 


32 kd accessory protein 


1101 


100 


207 


Y87283 


Homo sapiens 


Human signal peptide 
containing protein HSPP-60 
SEQ ID NO: 60. 


1318 


100 


208 


Y02860 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 65. 


936 


98 


209 


AL121889 


Homo sapiens 


dJ1076E17.1 (KIAA0823 protein 
(continues in AL023803) ) 


£94 


54 


210 


AF226732 


Homo sapiens 


NPD007 


1345 


76 


211 


X66295 


Mus musculus 


C1q C chain 


970 


73 


212 


Z29328 


Homo sapiens 


Ubiquitin-conjugating enzyme 
UbcH2 


966 


100 


213 


Z29328 


Homo sapiens 


Ubiquitin-conjugating enzyme 
UbcH2 


542 


98 


214 


AJ002030 


Homo sapiens 


progesterone binding protein 


1163 


100 


215 


X70649 


Homo sapiens 


member of DEAD box protein 
family 


3933 


100 


216 


AF250558 


Homo sapiens 


claudin-2 


1169 


99 


217 


AL021453 


Homo sapiens 


dJ821D11.1 (PUTATIVE protein) 


259 


100 


218 


Y06565 


Homo sapiens 


UDP-GalNAc: polypeptide N- 
acetylgalactosaminyltransfera 
se 


3331 


99 


219 


Y94452 


Homo sapiens 


Human inflammation associated 
protein 


2067 


100 


220 


AL035521 


Arabidopsis 
thaliana 


putative protein 


315 


42 


221 


AL031786 


Schizosaccha 

romyces 

pombe 


putative proline-tRNA 
synthetase 


811 


41 


222 


AL109736 


Schizosaccha 

romyces 

pombe 


WD repeat protein 


626 


40 




X52493 


Glycine max 


DNA-directed RNA polymerase 


136 


23 


224 


AL03S^59 


Homo sapiens 


dJ979Nl.l (dJ979Nl.l) 


5199 


98 


225 


AB032401 


Mus musculus 


mmDj4 


1761 


92 


226 


AB032401 


Mus musculus 


mmDj4 


1988 


92 


227 


X83502 


Saccharomyce 
s cerevisiae 


J1007 


112 


26 


228 


X83502 


Saccharomyce 


J1007 


79 


25 


229 


AF143723 


Homo sapiens 


heat shock protein HSP60 


2557 


99 


230 


Y66677 


Homo 
sapiens 


Membrane-bound protein 
PRO828. 


982 


100 


231 


AB027466 


Homo sapiens 


spondin 2 


1756 


99 


232 


W95634 


Homo 
sapiens 


Homo sapiens secreted 
protein. 


1391 


100 


233 


K00365 


Homo sapiens 


Human cyclin B1. 


2218 


99 


234 


Y53762 


Homo sapiens 


A GTP-binding polypeptide 


1017 


100 








designated RAQ. 






235 


Z50749 


Homo sapiens 


yeast sds22 homolog 


1800 


100 


236 


Z50749 


Homo sapiens 




1754 


98 


237 


AB026491 


Homo sapiens 


PICK1 


2137 


100 


238 


AJ270205 


Entodinium 
caudatum 


putative 
phosphatidylinositol-4- 
phosphate 5-kinase 


114 


37 


239 


AB030189 


Mus musculus 


contains transmembrane (TM) 
region and ATP binding region 


710 


93 


240 


W56538 


Homo sapiens 


Human hedgehog interacting 
protein fHTPl 


3785 


99 


241 


W56538 


Homo sapiens 


Human hedgehog interacting 


3436 


99 


242 


AF155107 


Homo sapiens 


NY-REN-37 antigen 


996 


99 


243 


AF155107 


Homo sapiens 


NY-REN-37 antigen 


1005 


100 


244 


AL031320 


Homo sapiens 


aiJ2 0N2.i {novel protein 
similar to yeast and 
bacterial cytosine 
deaminase) 


763 


99 


245 


U37026 


Rattus 
norvegicus 


sodium channel beta 2 subunit 


162 


30 


246 


AL078599 


Homo sapiens 


dJ991C6.1 (novel protein 
similar to C. elegans 
PS5A12.9 (Tr;P910B6)> 


2391 


98 


247 


U32274 


s cerevisiae 


xaxs obWp ; CAI : 0 . 12 


191 


37 


248 
249 


Y41^19 
AB029434 


sapiens 

nuiiiu £> ei£j jl en £> 


Human PR0864 protein 

sequence. 

ghrelin precursor 


1079 


100 


250 


X97831 


Rattus 


carnitine/acylcarnitine 
carrier protein 


611 
24£ 


100 
"38 


251 


W80993" 


sapiens 


Human RIP-interacting factor 
RIF. 


1724 


100 


252 


Y94873 


Homo 


Human protein clone HP02632. 


1876 


100 


253 
"254 


W59878 


Homo sapiens 


Amino acid sequence of the 
cDNA clone AIP-2 (HSBGM49). 


765 


100 




AL354533 


Leishmania 
major 


possible adenylate kinase 


265 


34 


255 


AF233322 


Mus musculus 


zinc transporter like 2 


1916 


95 


256 


Y78113 


Homo sapiens 


Human cytokine signal 
regulator CKSR-1 SEQ ID 
NO: 1 . 


2247 


99 


257 


AL035539 


Arabidopsis 
thaliana 


putative amino acid transport 
protein 


390 


27 


258 


W74787 


Homo sapiens 


Human secreted protein 1 
encoded by gene 58 clone 
HHPHN61. 


1171 


100 


"259 


AL03S^89 '"" 


Homo sapiens 


ou2.a /uix . jl (novel protein 
similar to protein kinase C 
inhibitors) 


974 


100 


260 


AE000909 


Methanobacte 
rium 

thermoautotr 
ophicum 


serine/threonine protein 

kinase rplah^rt nvnhaU 
rvA«a£>c Acidkeu prutein 


363 


30 


261 
262 


AL050131 
AF019661 


Homo sapiens 
Mus musculus 


hypothetical protein 

zeta proteasome chain; PSMA5 j 


626 
1214 


100 
100 


263 
264 

265 


AL035593 " 
AL'022318 " 

AF205940 


Homo sapiens 
Homo sapiens 

Homo sapiens 


dJ310J6.1 (novel protein) 
bK150C2.3 ( PUTATIVE novel " " 
protein similar to APOBEC1) 
endomucin 


821 
1072 

1289 


100 
100 

100 


266 

267' - 


AL023583 
AL034548 


Homo sapiens 
Homo sapiens 


dJ500L14.1 (novel protein) 
dJH03G7.3 (novel protein 
kinase domains containing 
protein similar to 
phosphoprotein C8FW) 


789 
1888 


100 
99 



268 



369 
270 



AF161470 
AF161470 



X90763 



Homo sapiens 



Homo sapiens 



Homo 
sapiens 



HSPC121 



HSPC121 



HHa5 hair keratin type I 
intermediate filament 



1884 



1232 



2190 



98 



ST 



99 



271 

~TfT 



AF207600 
M32334 



Homo sapiens 



ethanolamine kinase - 



1952 



100 



Homo sapiens 



intercellular adhesion 
molecule 2 



1436 



100 



273 
274 



AF161483 
Y530S2 



Homo sapiens 



HSPC134 



663 



61 



Homo sapiens 



Human secreted protein clone 
df202_3 protein sequence SEQ 
ID NO:110. 



100 



Homo sapiens 



Human cytoskeletal protein 
(HCYT) (clone 2195418). 



762 



100 



Homo sapiens 



30S ribosomal protein S7 
homolog 



1269 



100 



Homo sapiens 



Human secreted protein clone 
cai06_19x protein sequence 
SEQ ID NO: 20. 



1619 



98 



280 



281 



262 



283 



285 
266 



287 



288 



289 



290 



291T 
292 



Homo sapiens 



Z75134 



Amino acid sequence of a 
human phosphorylation 
effector PHSP-20. 



2801 



Canis 

familiaris 



rod transducin 



1816 



Z75134 



Canis 

familiaris 



rod transducin 



1718 



AF249873 



AL0500DT" 



Homo sapiens 



Homo sapiens 



muscle-specific protein" 



AF156102 



Homo sapiens 



Hypothetical protein" 



1395 



Y35897 



Homo sapiens 



DC1 



405 



Homo sapiens 



ELL complex EAP30 subunit 



1659 



U88964 

AL050143 



Extended human secreted 
protein sequence, SEQ ID NO. 
146. 



1316 



1250 



Homo sapiens 



Homo sapiens 



AF034801 



sapiens 



Homo sapiens 



HEM45 

hypothetical 



923 



telethonin 



protein 



598 



Membrane-bound protein 
PRO836. 



574 



2321 



liprin-alpha4 



2565 



'99 



100 



96" 



100 



98 
99 



99 



99 



100 



100 



100 



100 
98 



293 



"29T 



295 



296 



297 



298 
299 



AL0498S1 



aapie 



Homo sapiens 



liprin-alpha4 



V73348" 



L11672 



AL035423 



Homo sapiens 
Homo sapiens 



ctJ689J22B.l {novel protein 
(isoform 1) ) 

HTRM clone 83 9 4 Si protein 
sequence. 



2590 



1738 



124T 



Homo sapiens 



zinc finger protein 



AF198532 



Homo sapiens 



d«J20i3.1 (brain mitochondrial" 
carrier protein-1 (BMCP1)) 



169T - 



1024 



AF161417 



lymphoid enhancer binding 
factor-1 



2173 



AF159141 



Homo sapiens 



HSPC299 



Homo sapiens 



breast cancer metastasis- 
suppressor 1 



1147 



1236 



100 



100 



99 



44 



79" 



100 



85 



Rattus 
norvegicus 



inositol polyphosphate 4- 
phosphatase 



160 



30 



sapiens 



meningioma-expressed antigen 
5 



3458 



100 



302 
"303" 



Z82022 
AP269232 



sapiens 



Mus musculus 



GlcNAc-1-P transferase 



butyrophilin-like protein 
BUTR-1 



2067 



271 



99 



50 



Arabidopsis 
thaliana 



asparaginyl-tRNA synthetase" 



659 



50 



ACT2'72079- 
V44486 



Homo 
sapiens 



hematopoietic cell derived 
zinc finger protein 



"35T 



79 



306_ 
308 



Homo sapiens 



APOBEC-1 stimulating protein"" 



3056 



100 



Homo 
sapiens 



Human GPRW receptor 
polypeptide . 



1721 



100 



Homo sapiens 



DNA polymerase 



2598- 



100 



TABLE 2 



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


IDENTITY 


310 


AF293335 


Homo sapiens 


P30 DBC 


1248 


92 


311 


AF176525 


Mus musculus 


F-box protein FBL12 


1501 


93 


312 


X57802 


Homo sapiens 


immunoglobulin lambda light 
chain 


959 


81 


313 


Z36715 


Homo sapiens 


Net 


2048 


98 


314 


AF161532 


Homo sapiens 


HSPC047 


727 


100 


315 


AF208068 


Homo sapiens 


kelch-like protein KLHL3a 


3046 


100 


316 


Y66666 


Homo 
sapiens 


Membrane-bound protein 
PRO1013. 


1166 


100 


317 


Y29666 


Homo sapiens 


Human Ras protein RAPR-1. 


1253 


98 


318 


AJ387747 


Homo sapiens 


sialin 


2614 


99 


319 


AF161362 


Homo sapiens 


HSPC099 


224 


40 


320 


Y68773 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-5. 


2243 


99 


321 


AJ238379 


Homo sapiens 


putative TH1 protein 


3013 


100 


322 


AB040812 


Homo sapiens 


protein kinase PAK5 


3792 




323 


Y95013 


Homo sapiens 


Human secreted protein 
vc48_1, SEQ ID NO: 66. 


913 


100 


324 


Y13381 


Homo sapiens 


Amino acid sequence of 
protein PRO271. 


1976 


100 


325 


Y94944 




bf157_16 protein sequence 
SEQ ID NO:94. 




98 


326 


Y76884 


Homo sapiens 


Retinoblastoma binding 
protein-7 sequence. 


6728 


99 


327 


AF198532 


Homo sapiens 


lymphoid enhancer binding 
factor-1 


2173 


100 


328 


Z78013 


Caenorhabdit 
is elegans 


Similarity to Drosophila 
Cadherin-related tumor 
suppressor 


569 


33 


329 


AF212921 


Mus musculus 


MMTV receptor variant 1 


484 


94 


330 


Z75330 


sapiens] 

7 AO W ^ V / 

R65207 02- 
MAR- 1995 27- 
AUG- 1993 

Human 

stromalin-1. 

[Homo 

sapiens 


nuclear protein SA-l 

" 


6492 


99 

r 


33:. 


AXiO085B3 


Homo sapiens 


dJ327J16.3 (supported by 
GENSCAN, FGENES and GENEWISE) 


2133 


99 


332 


Y36104 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
489. 


310 


41 


333 


AJ271669 


Homo sapiens 


putative sialoglycoprotease 


1747 


100 


334 


AF156598 


Mus musculus 


p53-regulated DDA3 


997 


64 


335 


M99058 


Eimeria 
maxima 


em100 gene is homologous the 
Eimeria tenella gene et100 


154 


26 


336 


Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence. 


3386 


97 


337 


Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence. 


2602 


94 


338 


Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence. 


3447 


98 


339 


"ZS6561 


Caenorhabdit 
is elegans 


Similarity to Human rab13 
protein (PIR Acc. No. 
A49647). 


716 


34 


340 


AB021643 


Homo 
sapiens 


gonadotropin inducible 
transcription repressor-3 


2761 


99 


341 


G0194* 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6027. 


465 


98 


342 


AF020591 


Homo sapiens 


zinc finger protein 


1091 


48 


343 


L29154 


Homo sapiens 


immunoglobulin heavy chain 
VDJ region 


439 


84 


344 


U10281 


Sus scrofa 


gastric mucin 


279 


24 


345 


AK000404 


Homo sapiens 


unnamed protein product 


1177 


99 


346 


L22557 


Rattus 
norvegicus 


calmodulin-binding protein 


1949 


84 


347 


L22557 


Rattus 
norvegicus 


calmodulin-binding protein 


""2363 


91 


348 


AL049481 


Arabidopsis- 
thaliana 


AIG1-like protein 


316 


30 


350 


AJ251516 


Mus musculus 


cysteine and histidine-rich 
protein 


1460 


49 


351 


AK024477 


Homo sapiens 


FLJ00070 protein 


1773 


100 


"352 


U50133 


Homo sapiens 


ankyrin 


502 


33 


353 


AK000625 


Homo sapiens 


unnamed protein product 


721 


100 


354 




Homo sapiens 


HSPC302 


2523 


97 


355 


AJ010014 


Homo sapiens 


M95A protein 


1269 


47 


356 


AF15itt29™- 


Homo sapiens 


HSPC19* 


941 


91 


3S"7 


AL022327 


Homo sapiens 


dJ355C10 .1 (KIAAOO^TI 


1911 


100 


358 


W78128 


Homo sapiens 


Human secreted protein 
encoded by gene 3 clone 
HOSBI96. 


1117 


100 


359 

360 — 


X03414 
AF151079 


Drosophila 
melanogaster 
Homo sapiens 


HSPC245 


"vTe 


45 


361 


YS3B86 


Homo sapiens 


A suppressor of cytokine 
signalling protein 
designated HSCOP-6. 


643 
530 


100 
41 


362 

363 
364 


AP254741 

AF213465 
AF181562 


Drosophila 
melanogaster 
Homo sapiens 
Homo sapiens 


dual oxidase 
proSAAS 


681 
2016 

1319 1 ' 


46 

100 
100 


365 
366 
367 

368 


AF181562 

U73200 

AF263744 

U37501 


Homo sapiens 
Mus musculus 
Homo sapiens 

Mus musculus 


proSAAS 
p116Rip 

erbb2-interacting protein 
ERBIN 

laminin alpha 5 chain 


1624 
884 
4973 


99 
82 
99 


363 


AF043695 ~ 


Ca enorhabd i t 


S i mi lav to t"h«a nrni-oS r* " 


SB67 
549 


72 
36 






is elegans 


phosphatase 2c family 


370 
371 


Y73440 
AF272833 


Homo sapiens 
Homo sapiens 


Human secreted protein clone 
yj23_l protein sequence SEQ 
ID NO: 102. 
misato 


1484 


99 


372 


AF198454 


Homo sapiens 


epithelial protein lost in 
neoplasm beta 


2869 
3927 


100 


373 


Y73345 


Homo sapiens 


HTRM clone 43 8283 protein 
sequence . 


273 


"80 


374 
375 


AF169017 
taSlffg 


Homo sapiens - 
unidentified 


formiminotransferase 

cyclodeaminase 

RED ALPHA 


2717 


98 


376 

377 
378 


W74828 

Y32131 
M14912 . 


Homo sapiens 

Homo sapiens 
Homo sapiens 


Human secreted protein 
encoded by gene 100 clone 
HLQA352. 

Human LYST-2 protein, 
pol 


1202 
1012 

3556 


99 
99 


"379 
380 

im 


X66363 


Homo sapiens 
Homo sapiens" 


pRoosia 

serine/threonine protein 
kinase 


132 
382 
2499 


86 

100 " 
100 


Joi 


Y41699 


Homo 
sapiens 


Human PRO703 protein "™ " 
sequence . 


2362 


100 




AF174498 


Homo sapiens" 


GR AF-l specific protein 
phosphatase 


7008 


98 


383 
"384 


U64608 
U50133 


Caenorhabdit 
is elegans 
Homo sapiens 


coded for by C. elegans cDNA 
yk173c12.5 
ankyrin 


246 


36 


"385 


AJ2385275 


Homo sapiens 


putative transcription 
factor-like nuclear regulator 


502 
4123 


33 
97 


PCT/US00/34263 


387 


AF208845 


Homo sapiens 


BM-003 


1375 


" 99 


389 


X57821 


Homo sapiens 


immunoglobulin lambda light 
chain 


797 


76 


390 


AF182404 


Homo sapiens 


mitochondrial uncoupling 
protein 1 


1670 


99 


391 


Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence. 


3386 


97 


393 


AF178432 


Homo sapiens 


SH3 protein 


3700 


100 


394 


AF229928 


Drosophila 
melanogaster 


cytoplasmic protein 89BC 


1616 


62 


395 


AF181721 


Homo sapiens 


RU2S 


2254 


100 


396 


Y69197 


Homo sapiens 


Amino acid sequence of a 
human betaIV-spectrin 
protein. 


1626 


98 


397 


U48238 


Mus musculus 


zinc f inCTQlf nrohfiin nonrn.Hd 


/Hit 


60 


398 


AL390137 




hypothetical protein 




51 


399 


AF217525 


Homo sapiens 


Down syndrome cell adhesion 
molecule 


5337 


60 


400 


AL022599 


romyces 
pombe 


nu iepedi. ptocein 


447 


27 


401 


AC004B59 


Homo sapiens 


similar to 2-oxoglutarate 
dehydrogenase ; similar to 

W u « a J-U \ tr xu . gj. Ji^ b 1 o ) 


4176 


78 


402 


AB010266 


Mus musculus 


tenascin-X 


1024 6' 


*2 


403 


AL13328S 




avJo/ijj/.jL (similar to 
protein) 


761 


100 


404


Z68753 


Caenorhabditis elegans


>iV>3<ID • JO 


888 


48 


405


Z78013


Caenorhabditis elegans


Similarity to Drosophila

Cadherin-related tumor
suppressor


era " 


33 


406 


AB031230 


Homo sapiens 


protein containing CXXC 
domain 2 


1196 


97 


407 


AF155106 


Homo sapiens 


NY-REN-36 antigen


HOD 


1DU 


408 


Y57945 


Homo sapiens 


Human transmembrane protein
HTMPN-69. 




99 


409 


218361 


Ovis aries 


trichohyalin


,184 


30 


410


AF249744


Homo sapiens 


RhoGEF 


2733 


100 


411 


AF176529 


Mus musculus 


F-box protein FBX13 


2072 


94 


412 


AF210842 


Homo sapiens 




4880


100 


413 




Homo sapiens 


dJ310O13.7 (novel protein 
muiiiat to n . roreczi HKrh*i- 
3) 


776 


98 


414 


X57396 


Homo sapiens 


pm5 protein 


6131 




415 


A5029826 


Homo sapiens 


3-methylcrotonyl-CoA
carboxylase biotin-containing
subunit


2961 


"99 


416 


U43503 


Saccharomyces cerevisiae


Lphlp 


115 




417 


AL160493 


Leishmania
major


possible t2Sfl7.21 


239 


~n 


418


YO8100 


Homo sapiens 


Human PRO331 protein.


330 


29


419 


015131 


Homo sapiens 


pl26 


2228 


54 


420 


AF117946 


Homo sapiens 


Link guanine nucleotide 
exchange factor II 


2363 


100 


421 


AF190635 


Drosophila 
melanogaster 


ankyrin 2 


755 


30 


422 


AF302150 


Homo
sapiens 


phosphoinositol 3-phosphate-binding protein-2


1962


100 


423 


AL13753 0 


Homo sapiens 


hypothetical protein 


433 


"94 


424 


X63753 


Homo sapiens 


son-a 


7269 


100 


425 


AB027249 


Homo sapiens 


MAPKK like protein kinase 


1693 


100 


426 


AF279144 


Homo sapiens 


tumor endothelial marker 7
precursor


1084 


" i 









Homo sapiens 


tumor endothelial marker 7 
precursor 


1259 


56 




AawUj O {J J 


Drosophila 
melanogaster 


" CG8312 gene product 


149


29 


429 


Y07829 


Homo sapiens 


RING finger protein


2201 


99 


430


AFQ96897 


Drosophila 
melanogaster 


pushover 


4442 


47 


431 


U413B7 


Homo sapiens 


Gu protein 


4021 


99 


432 


AF023674 


Homo sapiens 


nephrocystin 


3783 


100 


433 


AF146760 


Homo 
sapiens 


septin 2-like cell division
control protein 


2284 


100 


434 


AB006697 


Arabidopsis 
thaliana 


cleft lip and palate 
associated transmembrane 
protein-like 


684 


42 


437 


Y94247 


Homo sapiens 


Human calcium binding protein 
hCBP. 


1704 


100 


438 


AB040672 


Homo sapiens 


UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransferase


1075


63 


439 


AF105228


Bos taurus 


tuftelin 


28* 


33 


440 


R064S3 


Homo sapiens 


Derived protein at clone 
ICA13 (ATCC 40553) . 


3073 


99 


441 


X14971 


Mus musculus 


alpha-adaptin (A) /AA 1-977; 


4897 


98 


442 


X53773 


Rattus 
norvegicus 


alpha-c large chain (AA 1-938)


3979 


81 


443 


Y66689 


Homo 
sapiens 


Membrane-bound protein
PRO1136.


3299 


99 


444 


AC0677B4 


Arabidopsis 
thaliana


unknown protein; 20348-23707 


114 


33 


445 


AF229032 


Mus musculus 


pilj 


2077 


$3 


446 


AP056035 


Rattus 
norvegicus 


s-nexilin 


26*2 


85 


447 


AF132484


Mus musculus 


unknown 


478


51 


448 


W89024 


Homo sapiens 


Polypeptide fragment encoded 
by gene 156.


528 


45 


449 


AF161445 


Homo sapiens 


HSPC327 


1606 


100 


450 


Z68753 


Caenorhabditis elegans


ZC518.3b 


951 


49 


451


W39160 


Homo sapiens 


Human partial complement
factor H protein fragment 3. 


155 


32


452 


W85727 


Homo 
sapiens 


Novel protein (Clone 
BM46_10).




99 


453 


Y53629 


Homo sapiens 


A bone marrow secreted 
protein designated BMSH5. 


2810 


100 


454 


DB7438 


Homo 
sapiens 


Similar to a C. elegans 
protein in cosmid C14H10 


4069 


100 


455 


AF240468 


Homo sapiens 


nicastrin 


3687 


100 


456 


Z15005 


Homo sapiens 


CENP-E


13305 


99 




MS9216 


Homo 
sapiens 


gamma-aminobutyric acid
receptor beta-1 subunit 


2477 


100 


458


Y73467 


Homo sapiens 


Human secreted protein clone 
yd6l_l protein sequence SEQ 
ID NO: 156. 


966 


100 


459 


W*7824 


Homo sapiens 


Human secreted protein 
encoded by gene IB clone 
HSLFM29. 


535 


100 


460


AF163151 


Homo sapiens 


dentin sialophosphoprotein 

jj J> c v» ux a (J 


279 


19 


"461 


D87443 


Homo sapiens 


Similar to a C. elegans 
protein encoded in cosmid 
C27F2 (U40419) 


9196 


99 


462 


004044 


Homo sapiens 


Human secreted protein, SEQ

ID NO: 8125. 


486 


93 


463 


AC002398 


Homo sapiens


F25965 1 


1018 


100 


464 


AF06485'£ 


Rattus sp. 


7acomp protein 


1645 


84


465 


AF223408 


Homo sapiens 


B99 


3*86 


99


466 

'Ml 


AF223408


Homo sapiens


"fB99 — 1 


2878 


87




AF104415 


Mus musculus 


gene trap locus -13 


6336 


91


468


U53450 



Rattus 
norvegicus 


JDP-1


196 


49 


469 


AL031297 


"" Homo sapiens 


dJ97P20.1 (novel gene)


3564 


99 


470 


AF257077 




initiation factor EIF2B 
1 subunit 3 


1274 


95 


471 


L28125 


Podospora 
anserina 


beta transducin-like protein


284 


38 


472 


Y84903 


Homo sapiens 


A human proliferation and 
apoptosis related protein. 


2337 


100 


473 


AP144237 




i protein 


252 


44 


474


Y71213 


Homo sapiens 


Human irritable bowel disease 
1 reiacea polypeptide IMX39. 


8J8 


100 


475 


Y95006




J Human secreted protein 
f veu_i / bby lu NO: 52. 


3411 


100 


476 


D3B549 


Homo sapiens 


J hal025 is new 


6533 


99 


477 


AF241230 


Homo sapiens 


TAK1-binding protein 2


3656 


100 


478 




Schizosaccharomyces
pombe


putative asparagine synthase 


482 


40 


479
480


L28125 


Podospora
anserina 


beta transducin-like protein


233 


26 




AF161544 


Homo sapiens 


HSPC059 


434 


77 


481 


AJ23824B 




centaurin beta2


3986 


99 


482 


Z3 8061 


Saccharomyces cerevisiae


mal5, seal, len: 1367, CAI: 
O.J, AMYH YEAST P08640 
GLUCOAMYLASE S1 (EC 3.2.1.3)


295 


23 


483 


AF161381 


Homo sapiens




1404 


100 


484 


AF223468 


Homo sapiens


>vJUiii protein 


1314 


100 


486 
487 


*57527 
Y19062 


Homo sapiens 


alpha KVIli) collagen 
39K3 protein 


4166 


99 


488 


Y73373 




Hikm clone 921803 protein 
sequence . 


2475 
555 


100 
56 


489 


AL02191Q 


Homo 
sapiens 


b34!8.1 (Kruppel related Zinc
Finger protein 184) 


4184 


100 


490 


X53773


Rattus norvegicus


alpha-c large chain (AA 1-938)


4675 


97 


491 


U52426 


Homo sapiens


GOK 


1459 


59 


492 


AL359773 


Leishmania
major


possible threonine synthase 


702 


45 


493 


AF22S614 1 


Homo sapiens


ferroportin1


2929 


100 


494 
495 


Z93241
AF036977 


Homo sapiens


acr222Ei3.i (novel protein 
with some similarity to 

u t uaupn± x a IvKAXivN / 

unknown


513 
1812 


96 
100 


496
497 


U93564 
Y91405 


Homo sapiens


P* w 

Human secreted protein 
sequence encoded by gene 2 


133 
357 


45 
100 


498 


AF069781 


Drosophila 
melanogaster


Bem46-like protein 


653


43 


499 


Y16601 


Homo sapiens 


*±**M*t*4J L y L1C 

phosphoprotein CECYP-2. 


1658 


98 


500 


X70944 


Homo sapiens 


PTB- associated splicing 

factor 


3883 


100


501 
502 


AF027503 
AF282874 


Mus 

musculus
Homo sapiens


putative membrane-associated
guanylate kinase 1 
nectin 3; PRR3 


205 
2856 


36 
99 


503 
504 
505 


AJ249732 
AF208B61 
L09708 


Homo sapiens
Homo sapiens
Homo sapiens


G8 protein 

BM-019
complement component C2 


£69 

1629 

4022 


100 

100
100 


50? " " 
508 


X66285 
D00189


Mus musculus
Rattus
norvegicus


HC1 ORF 

Na+,K+-ATPase alpha-subunit


115 
5227 


43
99
Human secreted protein clone 
fal71JL protein sequence SEQ 
ID NO:148. 






Y94971 



Homo sapiens 



2176



100 



511



512 



AB019038 



Homo sapiens 



Homo sapiens 



AB019038 



Homo sapiens 



beta-1,4 mannosyltransferase
beta-1,4 mannosyltransferase



781 



beta-1,4 mannosyltransferase



1347 



1520



77 



100 



X84908 



Homo sapiens 



phosphorylase kinase 



5729 



99 



515 



516 



X528S1 



Homo sapiens 



AF186084 



Homo 
sapiens 



G03602 



Homo sapiens 



peptidyl prolyl isomerase
epidermal growth factor 
repeat containing protein 



650 



3046 



U04706



Human secreted protein, SEQ
ID NO: 7683. 



505 



Bos taurus 



50 kDa protein 

Human secreted protein, SEQ 



1749 



76 



99 



99 



77 



Q00653 



Homo sapiens 



ID NO: 4734. 



530 



100 



AF161475 
Y99364 



Homo sapiens 



Homo sapiens 



HSPC126 

Human PRO1475 (UNQ746) amino
acid sequence SEQ ID NO: 88.



1368 



3394 



100 



97 



AF266BS2 
AE000995 



Homo sapiens 
Archaeoglobus fulgidus



PTPLA 



chromosome segregation 
protein (smel*) 



1295 



IsT 



100 



20 



AF062249 



Homo sapiens 



AJ223830 



immunoglobulin heavy chain 

variable region 

AREl



605 



57 



W0153S 



Rattus 
norvegicus 



2950 



Homo sapiens 



Cellular homologue of the
SV40 large T antigen. 



1276 



98 



83 



AF145658



Drosophila 
melanogaster 



BCDNA.GH10229 



320 



33 



AF1122X3 



Homo sapiens 



putative Rab5-interacting
protein 



524 



D49387 



Homo 
sapiens 



NADP dependent leukotriene b4
12-hydroxydehydrogenase



1616



100 



Homo sapiens 
Homo sapiens 



Human secreted protein 
encoded from gene 9. 



328 



dJl32F2l.3 (72.1 KDa protein 
(DKFZP564A032, SBBI88) 
similar to mouse IFN-gamma
induce MG11. ) 



AL079335 



1055 



Human secreted protein 
sequence encoded by gene 56 
SEQ ID NO: 179.



99 



Homo sapiens 



1159 



96 



Caenorhabditis elegans



carrier protein (c2) 



576 



5Q- 



X76116 



X12966 



Y09267



Caenorhabditis elegans



Homo sapiens 



carrier protein (c2) 



506 



3-oxoacyl-CoA thiolase 
propeptide (424 AA)



1972 



50 



100 



Homo sapiens 



Z11773



D84224 



Homo sapiens 
Homo sapiens



flavin-containing 
monooxygenase 2 



2W 



SRE-ZBP 



methionyl tRNA synthetase
methionyl tRNA synthetase



2201 



4741 



100 



99 
99 



D84224 



Homo sapiens 
Homo sapiens 



887 



99 



D84224 



J03244 



Homo sapiens 



Bos taurus 



methionyl tRNA synthetase
methionyl tRNA synthetase 



2933 



Y92514 



Homo sapiens 
Homo 



H+-ATPase 31kDa subunit (EC
3.6.1.3) 



4529 



848 



Human QXRE-11. 
Smad- and Olf-interacting



2301 



96 



99 



77 



99 



AF221712 



AE000919 



sapiens 



zinc finger protein
conserved protein 



2151 



61 



Methanobacterium
thermoautotrophicum



207 



A06669



synthetic 
construct 



preTGF-betal 



2070 



38 



99 






WO 01/53312 





546 

547


Y02698 
AF112205


Homo sapiens 
Homo sapiens 


Human secreted protein 

encoded by gene 49 clone 
HTPCS60.
wsb-1 protein 


854
2275 


98 
100 


548 
549 


X60271 
AC016827 


Mus musculus 

Arabidopsis 

thaliana 


c-rel 

putative GTPase 


2264 
B10 


74
42


550


Y70400 


Homo 
sapiens 


Human cell-signalling
protein-2.


429 


" 68 


551 


AB048365 


Homo sapiens 


NEDD4-like ubiquitin ligase 1


8290 


99 


552 


Y57880 


Homo sapiens 


Human transmembrane protein 
HTMPN-4. 


1112 


95 


553 


" "5*11985* 


Homo sapiens 


PRO1847


265 


67 


554 


M17236


Homo sapiens 


MHC HLA-DQ alpha precursor 


1332 


100 


555 


* AL07B468 


Arabidopsis 
thaliana 


putative protein 


540 


40 


556 
557 


~C006963 — 
AK024487 


Homo sapiens 
Homo sapiens 


similar to Kelch proteins; 
similar to BAA77027
(PID:g4650B44)
FLJO00B6 protein 


515 
1623 


44 
98 


558 
559 

560 


M12140 
W74825 

X56S81 


Homo sapiens 
Homo sapiens 

Homo sapiens 


pol gene protein; Xxx 
Human secreted protein 
encoded by gene 97 clone 
HAQBF73.
JunD protein


117 
225 


48

56


561 


AF003136 


Caenorhabditis elegans


contains weak similarity to 
an AMP-binding motif 


373 
2926 


88 
54 


562 


AL109839 


Homo sapiens 


CU1069P2.3.1 (novel PABPC1 
(poly(A)-binding protein)


B77 


100 


563 


AF181640 


Drosophila 
melanogaster 


BCDNA.GH09817 


289 


42 


564 

566 
567 
569 


AF052723 

AF161472 
Y28817 
U09848 
AF155113 


Feline

leukemia 

virus 

Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


gag-pol precursor polyprotein
gPr80

HSPC123 

pt326_4 secreted protein, 
zinc finger protein 
NY-REN-55 antigen


1547 

439 
3338 
1738 
3603 


43 

44 
100 

100

93 


570 
571 
572 
573 
574 


AF155113

AL032821 

M69181 

M69181 

Y5967B 


Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens

Homo sapiens 


NY-REN-55 antigen

dJ55C23.1 (vanin 1) 

non-muscle myosin B

non-muscle myosin B

Secreted protein 108-008-5-0-"" 

E6-FI,. 


3951 
1821 
7350 
7311 
772 


99

98

99 

98 

100 


575 


AL365234 


Arabidopsis 
thaliana 


putative protein 


788 


40 


576 


AL365234


Arabidopsis 
thaliana 


putative protein 


788 


40 


577 
578 


X0674S 
AB041642 


Homo sapiens 
Homo sapiens 


DNA polymerase alpha-subunit
(AA 1-1462)
PAR-6


7619 


99 


579
580


08*984 


Homo sapiens 


Similar t*n vpant* arionul 

cyclase (S56776) 


1342 
2446 


100 
100 




AF165124 


Homo sapiens 


gamma-aminobutyric acid A
receptor gamma 2 


2499 


99 


581 


W88812 


Homo sapiens 


Polypeptide fragment encoded 


2339 


99


582 


D82319 


Homo sapiens 


by gene 58. 

nov*> 1 fyp, i? ' - 






583
584 


P92219 
AJ22394B 


Homo sapiens 
(human) 
Homo sapiens 


"uvci. war 

CR1 protein. 
RNA helicase 


342 
11425 


100 
99 


585 


Y08612 


Homo sapiens 


88kDa nuclear pore complex
protein 


6608 
3874 


99 
99 


586 
587 


Y42384 
RF129756 


Homo 
sapiens 
Homo sapiens 


Amino acid sequence of

Iv3l0 7. 

BAT4 


1007 
1873 


37 
98 





588 


AP13177^ 


Homo sapiens


Unknown


1929 


99 


589 


AJ250B65 


Homo sapiens 


TESS 2


2348 


100 


591 


Z9B885 


Homo sapiens


(JJ522J7.2 (bromodomain- 

containing 1 (similar to 
peregrin, BR140) ) 


*iib / 


100 


592 


L76571 


Homo sapiens 


nuclear hormone receptor 


1355 


100 


593 


1PDQ1 COO 


Homo sapiens 


PHD finger protein 3 


9054 


100 




VCCOft'T 

A3 b □ U / 


Homo sapiens 


desmocollin type 2a


4443 


100


595 


AL137802 


Homo sapiens 


CU79BA10.1 (novel protein) 


" 212 


55 


596 


AL022329 


Homo 
sapiens 


bK4 07Fli.2 (adrenergic, beta, 
receptor kinase 2) 


3653 


100 


597 


AF226048 


Homo sapiens 


GL003 


2009 


99 


598 


1X.TO 1011*5 


Homo 

sapiens) 

>Y49635 


putative cell cycle control 
protein 


335 


23 


















OCT- 1999 15- 












JV DO 1 ftfto 

ArK-199B 












tinman J. ^ c 

rtuuian S ap J . a 

protein. 

[Homo 

sapiens 








599 


Y59741 


Homo sapiens 


Human normal ovarian tissue 
derived protein 10. 


1574 


99 


600 


L36531 


Homo sapiens


integrin alpha B subunit 


5386 


99 


601 


Y36458 


Homo sapiens


Human secreted protein 
encoded by gene No. 20.


895 


100 


602 


AF21S584 


Homo sapiens 


GGA1 


3265 


100 


603 
604 


Y13115 

AL132776 


Homo sapiens
Homo sapiens 


serine/threonine protein
kinase 

dJ393D12.1 (KIAA0776) 


5071 


99 


605


AL0344S2 


Homo sapiens


aubo^uib.i movei Collagen 
triple helix repeat 
containing protein) 


2413 
1979 


99 
100 


606


Y14494 


Homo sapiens 


aralarl 


3465 


99 


607 


AJ001981 


Homo sapiens 


OXA1L 


2603


100 


608 


X86098 


Homo 
sapiens 


binds directly to adenovirus 
type 5 E1A protein 


"3069 


100 


610
611 

612


AF163572 
AF161503 


Homo sapiens 


Forssman glycolipid
synthetase 


1865 
1261 


99 
97 


613 

614 
615 
616 


1*1834 
Y91954 

AL022327 

X85786 

Y08319 


Bnsis minor 

Homo sapiens 
Homo sapiens 


nuclear protein 

protein 9 (CYSKP-9) . 

binding regulatory factor 
kinesin-2 


345 
3668 

361 
3203 


30 
100 

94 
100 


617
618 

"SIS — 


D12644 
U28789 
Y35914 


Mus musculus
Mus musculus
Homo sapiens


PACT L 

Extended human secreted 


3487 
3609 
5936 
1684 


99 
97 
89 
99 








f 4 - "-"-*— xii sequence, oCiU ±ij wu . 
163. 




620 


A30463 82 


Mus musculus


testis-abundant finger
protein 


199 


23 


621 
622 


Y00062 
AFD68286 


Homo sapiens
Homo sapiens 


to 1120) 
HDCMD38P 


3440 
861 


99 
100 


623 

624


X98248 
X61100 


Homo sapiens 
Homo sapiens 


sortilin 

75 kDa subunit NADH 
dehydrogenase precursor 


4436 
3734 


99 

99


625
626


S5B544 
AF151027 


Homo sapiens 
Homo sapiens 


75 kDa infertility-related

sperm protein 

HSPC193 


2125 


99 


627 

r S5T 


X1496"B 


Homo sapiens 


RII-alpha subunit (AA 1-404)


582 
2079 


93 
100 




Y50911 


Homo sapiens 


Human fetal brain cDNA clone 
vb7_l derived protein 


1983 


100 





629 


Y50911 


Homo sapiens 


Human fetal brain cDNA clone 
vb7_l derived protein 


1694 


100 


630 


AP098786 


""Homo 

sapiens 


17 beta-hydroxysteroid
dehydrogenase type VII 


1754 


100 


631 


" ALD34555 


sapiens 


OJ.1J40JL9 . 3 (zinc linger 
protein J.3JL ipH£-©7) J 


4273 


100 


632 


W74826 




Human secreted protein 
encoded by gene 38 clone
HAQBT94.


794 


96 


633 


AF2B8288 






2236 


100


634 


AF041429 


Homo sapiens 


pRGRl 


823 


99 


635 


X66357 


Homo sapiens 


serine/threonine protein
kinase 


1589 


100 


636 


Y112B4 


"Uiiiu oopi S Ji 9 


AFXl 


2571 


98 


637 


AR004884 


Homo sapiens 


PKU-aipna 


3718 


99 


638


nu V £t w 


Homo sapiens 


synaptogyrin lc 


1020 


100 


639 


AJ002304 


Homo sapiens 




"i002 


100


640




Homo sapiens 


synaptogyrin lc 


933 


94 


641 


D87682 


Homo sapiens 


similar to a C. elegans
protein encoded in cosmid
T26A5.


26"76- 


100 


642 


M14660 


Homo sapiens 


ISG-K54 


2473 


99 


643


X06661


Homo sapiens 


calbindin (AA 1-261) 


1358 


100 


644 

CAC 


AF119900 


Homo sapiens 


PRO2822


185 


^6 




ABU 31Q48 


Drosophila
melanogaster 


microtubule associated- 
protein orbit 


738 


27 


646


AF250842 


Drosophila 
melanogaster 


multiple asters 


834 


29 


647 


X86691 


Homo sapiens 


Mi-2 protein 


10110 


99 


648 


U67934 


Homo sapiens 


44.9 kDa protein C18B11
homolog


827 


96 


649 


AF236061 


Oryctolagus 
cuniculus 


RING-finger binding protein


3330 


91 


650 


AL034553 


Homo sapiens 


dJ9i4P20.2 (KIAA0784 protein 
similar to Mus musculus 
activity-dependent
neuroprotective protein 
(Adnp)) 


5708 


100 


653 


X14766 


Homo sapiens


GABA-A receptor alpha 1
subunit 


2388 


99 


654 




Homo sapiens 


similar to f-spondin proteins 
AB006086 (PID:g2529225) 


3026 


99 





yg^gflk 


Homo sapiens 


Human transmembrane protein 

HTMPN-32.


60B 


99 


656


Z34975 


Homo sapiens 


IdlCp 


3733 


100 


658 


AL0S03 06 


Homo sapiens


du475B7.2 (novel protein) 


1942 


99


659 


W76734 


sapiens 


Human mDia Rho targeting 
protein. 


781 


34 


660 


AF202724 


Homo sapiens


Sad1 unc-84 domain protein 1


2172 


100 


661 


Z21966 


Homo sapiens 


mPOU homeobox protein 


1529 


100 


662 


AJ242954 


Mus musculus


dysferlin


4752 


59 


663 


AFZ82315 




myoferlin


6232 


99 


6^5" 


All 6 151 6 


Arabidopsis

thaliana 


hypothetical protein 


209 


30 


" 667" 


X59303


Homo sapiens 


valyl-tRNA synthetase 


3393 


99 


668 


Y133 SS 


Homo sapiens 


Amino acid sequence of 
protein PRO220.


3*92 


100


669 


AB010692 


Arabidopsis 
thaliana 


beta-N-acetylglucosaminidase 
gene 


"fill 

on 




671 


X56123 


Mus musculus


talin 


4474 


"76 


672 


AB039371 


Homo sapiens 


mitochondrial ABC transporter 
3 


2902 


99 


673 


AF269223 


Homo sapiens 


TCP11 


806 


42 


674 


AF229633 


Mus musculus 


groucho-related protein 4 


4053 


99 


675


1*14463 


Rattus 


transducin


3619 


92 









norvegicus 








676 


AC005757 


Homo sapiens


R32611 1 


2779 


100 


677 


S61069 


Homo sapiens 


reverse transcriptase 
homolog=pol {retroviral 
element} 


252 


65 


678 


AF271388 


Homo sapiens 


CMP-N-acetylneuraminic acid 
synthase 


2273 


100 


679 


X79066 


Homo sapiens 


ERF-1 


1783 


100 


680


AF118566 


Mus musculus


hematopoietic zinc finger 
protein 


769 


50 


681 


Y51415 


Homo 
sapiens 


Human wild type pKe83 
protein. 


2621 


99 


682 


AL13354S 


Homo sapiens 


^bA386Ni4.1 (novel protein 
similar to a dual specificity 
phosphatase) 


700 


68 


683 


Y86214 


Homo sapiens 


Nuclear transport protein 
clone hfb34l protein 
sequence. 


5888 


99 


684 


Y94952 


Homo sapiens 


Human secreted protein clone 
fhll6_ll protein sequence 
SEQ ID NO: 110. 


354 


98 


685 


AL021878 


Homo sapiens 


dJ257I20.4 (transcription 
factor 20 (AR1) (KIAA0292) 
(isoform 2) ) 


"154 * 


67 


686 


AB000198 


Escherichia 
coli 


orf, hypothetical protein


628 


100 


687 


M58378 


Homo sapiens 


synapsin I 


3730 


99 


688


AF039697


Homo sapiens 


antigen NY-CO-31


SOB 


98 


689 


U09355 


Oryctolagus 
cuniculus 


protein phosphatase 2A1 B 
gamma subunit 


2356 


99 


690 


AF155106 


Homo sapiens 


NY-REN-36 antigen


265 


50 


691 


AC004774 


Homo sapiens 


Dlx-5 


1S42 


100 


692 


X90530


Homo sapiens 


ragB 


192* 


99 


693 


X90530


Homo sapiens 


ragB 


1405 


99 


694 


X90530 


Homo sapiens 


ragB 


1590 


85 


695 


G01563 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5644. 


330 


100 


696 


ACullBlO 


Arabidopsis
thaliana 


Putative methionine 
aminopeptidase


669 


52 


697 


AJ250425 


Rattus 
norvegicus 


Collybistin I 


2455 


98 


698 


AB037901 


Homo 
sapiens 


gene amplified in squamous 
cell carcinoma-1


5364 


99 


699 


¥95*4 01 


Homo sapiens 


Human PRO1327 (UNQ687) amino
acid sequence SEQ ID NO: 218. 


1386 


100 


701 


AF221712


Homo 
sapiens 


Smad- and Olf-interacting
zinc finger protein 


6705 


100


702 


X83573 


Homo sapiens 


ARSE 


3184 


99 


703 


AJ243274 


Homo sapiens 


AP-2rep protein 


2078 


99 


704 


Y71262 


Homo sapiens 


Human chondromodulin-like 
protein, Zchml.


1697 


94 


705 


Y71262 


Homo sapiens 


Human chondromodulin-like
protein, Zchml. 


1736 


99 


706 


V41257 


"Homo sapiens 


Amino acid sequence of long 
human FAIM. 


1060


100 


707 


AL022237 


Homo sapiens 


bK119lB2.3 (PUTATIVE novel 
Acyl Transferase similar to 
C. elegans C50D2.7) (isoform 
1) ) 


2030 


100 


708


AJ006266 


Homo sapiens 


AND-1 protein


5942 


100 


709 


G01571 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5652. 


111 


99 


710 


Y08698 


Homo sapiens 


ranbp3


2849 


98 


711 


Y68770 


Homo sapiens 


Amino acid sequence of a
human phosphorylation
effector PHSP-2.


754 


99





712 


U93574 


Homo sapiens 


putative p150


799 


59 


713 


AC004531 


Homo sapiens 


Gene with similarity to DEAD
box helicases 


2715 


99 


714 


D89016 


Homo sapiens 


Neuroblastoma 


538 


48 


715 


Y92175 


Homo sapiens 


Human cardiovascular system 
associated protein tyrosine 
phosphatase 2. 


734 


98 


716


AL137013 


Homo sapiens 


DA311P8.3 (probable uracil 
phosphoribosyltransferase)


862 


100 


717


AB035123 


Mus musculus


GD1 alpha/GT1a alpha/GQ1b
alpha synthase


1696 


93 


718 


Y96290 


Homo >P40254 
P40254 25- 
OCT-1984 09- 
APR-1983 
Human IgD. 
[Homo 
sapiens 


Human IGFAM-2 immunoglobulin. 


2345 


85 


719 


X6V979 


Homo sapiens 


integrin beta 1 subunit 
precursor 


4347 


99 


720 


AJ224819 


Homo sapiens 


tumor suppressor 


2149 


99 


721 


Y07595 


Homo sapiens 


transcription factor TFIIH 


2373 


100 


722 


N41565 


Homo 

sapiens) 

>W41564 

W41564 08- 

OCT-1997 05- 

APR-1996 

Human 

calpain. 

[Homo 

sapiens 


Human calpain. 


1591 


99 


723 


AF161341 


Homo sapiens 


HSPC078 


1097 


98 


724 


AF187318 


Homo sapiens 


F-box protein Fox2 


1607 


100 


725 


AC006708 


Caenorhabditis elegans


contains similarity to
Saccharomyces cerevisiae pre- 
mRNA splicing protein PRP31 
(GB:Z72876) 


1143 


46 


726 


AC006708 


Caenorhabditis elegans


contains similarity to
Saccharomyces cerevisiae pre-
mRNA splicing protein PRP31
(GB:Z72876)


988 




727


AC024818


Caenorhabditis elegans


contains similarity to Pfam
family PF00400 (WD domain,
G-beta repeat), score=81.8,
E=1.4e-20, N=3


950 


44 


728


AJ005897 


Homo sapiens 


JM5 


831 


47 


729 


Y45377 


Homo sapiens 


Human secreted protein 
fragment encoded from gene 
27. 


90B 


"97 


730 


G03931 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8012. 


578 


100 


731 


AB012720 


Oncorhynchus
masou


GTP-binding protein


3865 


76 


732 


W73404 


Homo sapiens 


Human secreted protein 
encoded by Gene No. 8.


842 


97 


733


G02650


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6731. 


644 


"97 


734 


AC024813


Caenorhabditis elegans


Hypothetical protein 
Y54FlOAL.a 


152 


24 


735 


AL035461


Homo sapiens 


dJ967N21.6 (novel CDP-alcohol
phosphatidyl transferase 
family member protein) 


1562 


98 


736 


U00033 


Caenorhabditis elegans


similar to S. cerevisiae YJU2 
protein 


605 


41 


737 


AF07909B 


Homo 
sapiens 


arginine-tRNA-protein
transferase 1-1p; ATE1-1p


2733 


99 





738 


AJ131712 


Homo sapiens 


nucleolar RNA-helicase


2793 


100


739 


AJ133115 


Homo sapiens 


TSC-22-like protein 


2054 


99


740 


X98258 


Homo sapiens






100


741 


X98258 


Homo sapiens 


M-phase phosphoprotein 9 


564 


74 


742 


U97191 




strong Biroiiaricy CO tne irii 

Riih-f atti^ W nf Q&C nmf oino 
awiauiixy ot Ww piOUclilo 


9bU 


85 


743 


X76057 


Homo sapiens 


phosphomannose isomerase


2191 


100 


744 


G03209 


Homo sapiens


Human secreted protein, SEQ

■IV flu. 1 AZt\t . 


496 


98


745 


X97064 


Homo sapiens




4034 


99 


746 


W93946 


Homo sapiens 


Human regulatory molecule 
HRM-2 protein. 


994 


100 


747 


Y73388 


Homo sapiens 


HTRM clone 3376404 protein 
sequence . 


1565 


99 


1 AO 




Sus scrofa


follistatin A


1906 


98 


749 


AJ249457 


Trichomonas
vaginalis 


centrin, putative 


183 


28 


750 


AC004410 


Homo sapiens 


£os39554 1 


2094 


100 




AF07496B 


Homo sapiens 


p47ING3 protein 


2167 


100 


752 


AF252284


Homo sapiens 


transcription specificity 
factor Spl 


4005 


100 




AB04 9629 


Homo sapiens 


phospholysine 

phosphohistidine inorganic 
pyrophosphate phosphatase 


1375 


99 


" Y5"4 




Homo sapiens 


ribosomal protein L39


160 


*77 1 


99




Homo sapiens 


CDEP 


142 


29 


758


1*32162 


Homo sapiens 


transcription factor 


574 


80 


759 


AF037204 


Homo sapiens 


RING zinc finger protein 


295 


54 


"ncn 

760


Y44250 


Homo 
sapiens 


Human cell signalling 
protein-13.


625 


100 


761 


AF218586 


Homo sapiens 


Cide-b 


1136 


100 


762 


U38934 


Gallus 
gallus 


histone H2A 


625 


97 


763 


AF226053 


Homo sapiens 


HSKM-B 


606 


32 


764 


X13403 


Homo sapiens 


Oct-1 protein (AA 1-743)


362 b 1 


100


765 


D87446 


Homo sapiens 


Similar to a C. elegans 
protein encoded in cosmid 
C27F2 (U40419)


568 


38 


766


AL023828 


Caenorhabditis elegans


Y17G7B.14 


200 


27* 


767 


YB2777 


Homo sapiens 


Human chordin related protein 
(Clone dw665 4).


25*1 


99 


768


X92475 


Homo sapiens 


1TBA1 


1429 


100 


769


Y42752 


Homo sapiens 


Human calcium binding protein 

3 (CaBP-3).


1426 


100 


770 


X51416


Homo sapiens 


hormone receptor hERR1 (AA 1-521)


2641 


97 


771 


"AJ006591 


Homo sapiens 


cysteine-rich protein 


1793 


100 


772


£08695 " 




rap 2 


935 


100 


773 


"Z12173 




N-acetylglucosamine-6-
sulphatase


2970 


100 


774 


Y919S0 




Human cytoskeleton associated 
protein 5 (CYSKP-5) . 


DbD 


43 


776 


AL023799 


Homo sapiens


A770 0P7 1 — Ivlrin f nrro-rl 

\MJ344ri,x izinc ringer/ 


obb 


56 


777


AL023799 


Homo sapiens 


HJ322P7.1 (zinc finger)


855 


56


778 


(301980 


Homo sapiens


Human secreted protein, SEQ 
ID NO: 5961.


849 


98 


779 


AJ012590 


Homo sapiens 


glucose 1-dehydrogenase


4155 


99 


780


ALQ78582 


Homo sapiens 


CU130E4.2 (KIAA0796) 


1321 


68 


781 


Z75955 


Caenorhabditis elegans


similar to mitochondrial 
carrier protein 


384 


34 


782 


AL109965 


Homo 
sapiens 


dJ1121G12.2 (SCAN domain- 
containing 1 protein) 


900 


100 


783 


AF061262 


Mus 

musculus 


semaF cytoplasmic domain 
associated protein 2 


1316 


83 


784 


G03873 




Human secreted protein, SEQ 


649 


95








ID NO: 7954. 






785


Y84441 


Homo sapiens 


Amino acid sequence of a 
human RNA-associated
protein. 


2074 


100 


786


Y00918 


Homo sapiens 


Human Rab protein, RABP-1,
protein sequence. 


1048 


99 


787 


Z97029 


Homo sapiens 


ribonuclease HI large subunit 


1548 


99


788 


AB035384 


Homo sapiens 


SRp25 nuclear protein 


962 


94 


789 


AF024631 


Homo sapiens 


ANG2 


2644 


100


790 


AJ006710


Rattus 
norvegicus 


phosphatidylinositol 3-kinase


4500


97 


792 


V00638


bacteriophage lambda


reading frame ea10


600 


100 


793 


AF049103 


Homo sapiens 


Huntingtin interacting 
protein 


Bl9 


100


795 


Z26317 


Homo sapiens 


desmoglein 2 


4810 


99 


795 


Y76884


Homo sapiens 


Retinoblastoma binding
protein-7 sequence.


5080 


99 


797 


U15155 


Gallus 
gallus 


trypsinogen 


372




37 


798 


U97189


Caenorhabditis elegans


strong similarity to the
PI3/PI4 family of kinases


227 


28 


799 


AF112201 


Homo sapiens


iicuj. uiicii. pruiein vtyzo 


1053 


100 


800 




Rattus 
norvegicus 


serine-arginine-rich splicing 
regulatory protein SRRP86


958 


63 


801 


AF267852 




placental protein 13-like
protein


743 


99 


802 


AF208851 






766 


80


803


Z81097


Caenorhabditis elegans


Similarity to Human 

toUl IlVJU X clS U UTuol Ul li CI J. n M 

nirofc^in RRAPAC vlrfflC'Of*! "> c 
COITIS a from fK { o rr&n n 

\*WMlwo im Will i*UXO MwXlv2 


152 


27 


804 


G02113 


Homo sapiens 


ID NO: 5194. 




So 


805


AL121673 


Homo sapiens 


dJ305P22.1 (novel protein)


1160 


100 


806 


AC013483 


Arabidopsis thaliana


putative GTPase activator 
protein 


264 


30 


807 


AC013483 


Arabidopsis 
thaliana 


protein 


264 


30


808 


AB013885


Homo sapiens


beta-ureidopropionase


1494


100 


809 


AF078842 


Homo sapiens 


HUi'i'L protein 


1581 


99 


810 


AF161421 


Homo sapiens 


HSPC303


2134 


96 


811 


AF261689 


Homo sapiens 


DNA polymerase epsilon p17
subunit 


734 


100 


812 


Z74029 


Caenorhabditis elegans


Similarity to C. elegans
alcohol dehydrogenase comes
from this gene


610


71


813


Z73497 


Homo sapiens 


dJ240C2.2 (Core histone
H2A/H2B/H3/H4) 


324 


100 


814


W87689 


Homo 
sapiens 


Human HTXFT19 polypeptide. 


1484


99


815 


X16282 


Homo 
sapiens 


zinc finger protein (217 AA) 
(1 is 2nd base in codon) 


1109 


99 


816 


Z92539 


Mycobacterium tuberculosis


pth 


300 


JO 


818 


AB030483


Mus musculus 


B9 


197 


27 


819 


AL11755S 


Homo sapiens 


hypothetical protein 


321 


94 


820 


AC005328


Homo sapiens 


R2S660_2, partial CDS 


865 


97 


821 


G03951 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8032. 


700


99


822 


L34807 


Musca 
domestica 


transposase 


174 


20 


823 


G02928 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7009. 


558 


78


824 


Z99531 


Schizosaccharomyces pombe


caffeine-induced death
protein 1


184


29






825 


AJ006692 


Homo sapiens 


ultra high sulfur keratin


693


68


826 


U23037 


Oryctolagus cuniculus


eIF-2Bepsilon


3406 


90 


827 


G03412


Homo sapiens


Human secreted protein, SEQ
ID NO: 7493.


464 


100 


828 


Y30327 


Homo sapiens 


Human secreted protein 
encoded from gene 17. 


113 


44 


829


Y32199


Homo sapiens 


Human receptor molecule (REC) 
encoded by Incyte clone 
2022379. 


1012


100 


830 


W78279 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 33. 


1264 


99 


832 


AB011542 


Homo sapiens 


MEGF9


2097


100 


833 


G02639


Homo sapiens


Human secreted protein, SEQ
ID NO: 6720.


223 


70 


834 


AF119664 


Homo sapiens 


transcriptional regulator 
protein HCNGP 


1574 


100


835 


AF119664 


Homo sapiens 


transcriptional regulator 
protein HCNGP 


1144 


89 


836 


AF119664 


Homo sapiens 


transcriptional regulator 
protein HCNGP 


1446 


94 


837


X12517 


Homo sapiens 


C protein (AA 1-159) 


918 


100 


838 


U32865 


Drosophila 
melanogaster 


linotte protein


164


24


839 


AF067730 


Homo sapiens 


TLS-associated protein TASR-2


631 


56 


840


U27831 


Homo sapiens 


striatum-enriched phosphatase


2840 


98 


841 


AF286366 


Homo sapiens 


CamKI-like protein kinase


1796 


100 


842 


G02309 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6390.


278 


98 


843 


AE003615 


Drosophila 
melanogaster 


ade3 gene product 


113 


48 


844 


G01350 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5431. 


629 


100


845 


U27838 


Mus musculus


glycosyl-phosphatidyl-
inositol-anchored protein
homolog


3305 


"9* 


847


Y87789 


Homo sapiens 


Human RBP-,26 protein. 


2026


100 


848 


AF164794 


Homo sapiens 


Diff33 protein homolog 


2398 


100 


849 


U41315 


Homo sapiens 


ZNF127-Xp 


2458 


93 


850 


AF192784 


Homo sapiens 


makorin 1


2062 


97 


851 


Y58628


Homo sapiens 


Protein regulating gene 
expression PRGE-21.


1548 


100 


852 


Z22968 


Homo sapiens 


M130 antigen 


6205 


100 


853 


Z22971 


Homo sapiens 


M130 antigen extracellular 
variant 


£380 


100


854


G03362 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7443. 


330 


se 1 


855 


G03362 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7443. 


203 


100


856


AF285118 


Homo sapiens 


CGI-203 


452 


100 


857


ACOObObS 


Arabidopsis thaliana


putative cleavage and
polyadenylation specificity
factor


1383 


55 


858 


AL021546 


Homo sapiens 


cytochrome c oxidase
polypeptide VIa-liver


593 


100 


859 


T02956 


Xenopus 
laevis 


ribonucleoprotein


1664 


65 


860


AF201947 


Homo sapiens 


MEK binding partner 1


616 


100 


861 


L31783


Mus musculus


uridine kinase 


1266 


92 


862 


AF161472 


Homo sapiens 


HSPC123 


602 


73 


863


Z49068


Caenorhabditis elegans


mitochondrial carrier protein 


370 


43 


864


AF154108


Homo sapiens


tumor necrosis factor type 1
receptor associated protein


3559


99

865 


AE001530 


Helicobacter pylori J99




230 


32 


866 


X57807 


Homo sapiens 


immunoglobulin lambda light
chain


699 


91 


867 
868 


AL031673
Y11652


Homo sapiens
Homo sapiens


dJ694B14.1 (PUTATIVE novel

KRAB box protein with 18 C2H2 
type Zinc finger domains) 
phosphate cyclase 


4066 


99 


869 


AF192968 


Homo sapiens 


high-glucose-regulated
protein 8 


238 
3041 


100 
99 


870 


AB020648 






3237 


S9 


871 


AL031427 


Homo sapiens 


dJ67A19.1 (novel protein)


1608


100 


872 


AF1 514 


Homo sapiens 


core histone macroH2A2.2 


1846 


100 


873
874


AL021331 
VIA cnp 

axt DUO 


Homo sapiens 
Homo sapiens 


dJ366N23.1 (putative C.

elegans UNC-93 (protein 1,
C46F11.1) like protein)
propionyl-CoA carboxylase


1129


100 


875 


AL117334 


Homo sapiens 


dJ687F11.1 (novel protein
(part of translation of cDNA
DKFZp434N061, Em:AL110249))


3579 
306


100
100 


876 


X79489 


Saccharomyces cerevisiae


E-925 protein 


446 




877 
878 


YS3 001 

AP2 31064 


Homo sapiens 
Homo sapiens 


Human secreted protein clone 
dn834_l protein sequence SEQ 
ID NO: 8. 
CHMP1.5


811 

957 


100 


879 
880 


AF001317 


Saccharomyces cerevisiae


40S ribosomal protein S12
Soi1p


687 
478 


100
28 


881 
882 


Y87275
M14036 


Homo sapiens 


Human signal peptide 
containing protein HSPP-52 
SEQ ID NO:52.
C1-inhibitor


2547 


100 


883 
884 


AB041261 

AF020in 


Homo sapiens 
Mus musculus 


calcium-independent
phospholipase A2
proline-rich protein 48


598 
2903 

~9~99 


77 
100 

84 


885 
886 


Y10936
AF073997 


Homo sapiens 
Mus musculus 


hypothetical protein 
myotubularin related protein 
1 


1104 
866 


99 
36 


887 
888


Y57893


Homo sapiens 
Homo sapiens 


Human transmembrane protein 
HTMPN-17. 

hypothetical protein 


1099 


94 


889 


AF210317 


Homo sapiens 


facilitative glucose 
transporter family member 
GLUT9


929 
2046 


99 
99 


890 


Y36031 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
416. 


583


100


891 


Y36031 


Homo sapiens


Extended human secreted 
protein sequence, SEQ ID NO. 
416. 


192 


57 


892 
893 


AF237631 
AF090929 


Homo sapiens 
Homo sapiens


ubiquitous tropomodulin U-Tmod

PRO0477p 


1798 


100 


894 


AL031228


Homo sapiens 


dJ1033B10.2 (WD40 protein 

BING4 (similar to S.
cerevisiae YER082C, M. sexta 
MNG10 and C. elegans F28D1.1) 


653 
3196 


99 
100 


895 
896


AL031228 
AF171102 


Homo sapiens 
Homo sapiens 


dJ1033B10.2 (WD40 protein 

BING4 (similar to S. 
cerevisiae YER082C, M. sexta
MNG10 and C. elegans F28D1.1) 


2825 


96 


897


AE003551


Drosophila
melanogaster


retinal degeneration B beta 
CG18176 gene product 


r302 
633 


9S 
33


898 


AJ237946 


Homo sapiens 


DEAD Box Protein 5 


2443 


100 


899 


Z97184 


Homo sapiens 


HKE2


624 


100 


900 


Z97104 


Homo sapiens 


HKE2


409 


98 


901 


AJ245587 


Homo sapiens 


Kruppel-type zinc finger


1942 


100 


902 


AF091034 


Homo sapiens 


GTP-binding protein RAB22A 


1011 


100 


903 


R95953 


Homo sapiens 


Eukaryotic cell growth 
inhibiting factor. 


414 


96 


904 


L04733 


Homo sapiens 


kinesin light chain 


1936 


72 


905 


AE003540 


Drosophila
melanogaster 


CG10984 gene product 


446 


33 


906 


145554 2 


Homo sapiens 


guanylate binding protein
isoform I 




98 


907 


M55542 


Homo sapiens


isoform I 


<SjU1 


3D 


908 


MOT UB Z3 


Homo sapiens


WDProl. 


1007 


100 


909 


AF168676 


Homo 
sapiens 


TNF intracellular domain- 
interacting protein


647 


100 


910 


AB029150 


Homo sapiens 


KRAB zinc finger protein 
HFB101L 


2196 


100 


911 


G02871 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6952. 


521 


100 


912 


G03162 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7243.


387 


87 


913 


AJ243721 


Homo 
sapiens) 
>Y92508 
Y92508 13- 
APR-2000 06- 
OCT-1998
Human OXRE-
5. [Homo
sapiens 


dTDP-4-keto-6-deoxy-D-glucose
4-reductase 


1710 


100 


914 


U24189 


Caenorhabditis elegans


hypothetical protein 1207-1;
Method: conceptual
translation supplied by
authors


244


41






Homo sapiens 


A human progesterone receptor 
complex p23-like protein.


843 


99 


916 


AE000984 


Archaeoglobus fulgidus


dinitrogenase reductase 
activating glycohydrolase 
(draG) 


171 


26 


918 


M23159 


V^X XLC LUB 

cricetus 


wpk coaaipxiiieu prouem 


163 


30 


919 


L12018 


Caenorhabditis elegans


m 1 1* A t* w 


1232 


*— 

41 


920 


AF102177 


Homo sapiens


im\MUU-± an i> xy oil ojjtr^a^ 




0*7 


921 


AL096712 


Homo sapiens


diT744T24 2 (similar to a 

novel human gene mapping to 
Activator) 


1U1 / 


TO 

to 


922 


AL161495 


Arabidopsis 
thaliana 


putative WD-repeat protein


86* 


42 


923 


AL161495 


Arabidopsis 
thaliana 


putative WD-repeat protein 


442 


36 


924 


U97001 


Caenorhabditis elegans


similar to 

Schizosaccharomyces pombe 


605 


51


925 


X71978 


Mus musculus 


Fi£ 


1503 


95 


926 


K92288 


Drosophila 
melanogaster 


beta-spectrin 


290 


51 


927 


Y27575 


Homo sapiens 


Human secreted protein 
encoded by gene No. 9. 


1392 


100 


928 


Y22499 


Homo sapiens 


Human secreted protein 
sequence clone mh703_l. 


2249 


100 


930 


AJ224326 


Homo sapiens 


ribulose-5-phosphate-
epimerase 


912 


100 


931 


U28991 


Caenorhabditis elegans


coded for by C. elegans cDNA 


660 


55 











CTO21C7 






932


AL080065




hviT>r^7Vi At* "i /TJ 1 nvAh £±i r\ 

uy^»utfiBt ivclj. piocein 


210 


25 


933 


G01384




Human secreted protein, SEQ 
ID NO: 5965. 


767 


98 


934


AJ276485




integral membrane transporter
protein 


1200 


100 


935 






dJ756G23.3 (novel protein 
similar to drosophila 
transcriptional repressor) 


1142 


80 


936 


AB026808 


Mus musculus


synaptotagmin XI 


2142


95 


937 


AB015345 


Homo sapiens 


HRIHFB2216 


2*01" ' 


99 


938 


X65724 


Homo sapiens 


ORF2


498 


100


939


u o q no A 
WH9U24 


Homo sapiens 


Polypeptide fragment encoded
by gene 156.


1487 


100 


940 


G04047 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8128. 


117


100


941 

QA 


AF094583 


Homo sapiens 


putative HIV-1 infection
related protein 


452 


100 




AC024200 


Caenorhabditis elegans


contains similarity to 
several zinc finger proteins 
but not to the zinc finger 
domains 


350 


69 


943 


AF129756


Homo sapiens 


G5C 


273 


100


944




Rattus 
norvegicus 


alpha-tropomyosin


133 


96 


945 


arnnoqi'7 


Arabidopsis 
thaliana 


Contains similarity to 


583


47


946




Homo sapiens 


AD021 protein 


551 


44 


947


AfoSS4"/5— 


Homo sapiens 


GAGE-8


273 


51 


948


X75756 


Homo sapiens 


protein kinase C mu 


2019 


68 


949 


AF143956 


Mus musculus


coronin-2 


2300 


93 


950 


Y36729 


Homo 
sapiens 


Human PG1 protein sequence. 


1861 


99 


951 


W49041


Homo sapiens 


Human low density lipoprotein 
binding protein LBP-2. 


282 


67 




vmn^ £ a ft i 

Asoi6 aei 


Arabidopsis 
thaliana 


gene_id:MXC17 .7- 


203


4* 


953


Y01785 


Homo sapiens 


Human ubiquitin-conjugating
enzyme >Y25341 Y25341 01-JUL-
1999 12-AUG-1998 Human NCE-2
protein.


3*5 

r 


100 


954 


AT HjoJ.3 


Drosophila
melanogaster


BcDNA.GH03377 


823 


46 


955


U09410 


Homo sapiens 


zinc finger protein ZNF131


2483 


99


956




Homo sapiens 


zinc finger protein ZNF131 


1853 


99 


957 


AF195623 


Homo sapiens 


cholinephosphotransferase 1
alpha 


2126 


99 


958


X94917 


Drosophila 
melanogaster 


head-elevated expression in 
0.9 kb 


155 


32 


959


U54807 


Rattus 
norvegicus 


GTP-binding protein


1167 


97 


960 


AF058807


Bos taurus 


GTP-binding protein rah 


606 


97 


961 


G03244 


Homo sapiens 


ID NO: 7325. 


471 


100 


962 


rtf V /OOjU 


Homo sapiens 


steroid dehydrogenase homolog 


583 


40


963


AP00175-4 


Homo sapiens 


transient receptor potential-
related channel 7, a novel
putative Ca2+ channel protein 


317 


30 


964 


AL035419 


Homo sapiens 


dJ1100H13.1 (putative novel 
protein) 


1129 


100 


965 


X61381 


Rattus 
rattus 


interferon-induced protein


202 


46 


966 


D38169 


Homo 
sapiens 


inositol 1,4,5-trisphosphate
3-kinase isoenzyme


3278 


100 


967 


AL031432 


Homo 
sapiens 


dJ465N24.2.1 (PUTATIVE novel 
protein) (isoform 1) 


893 


100


968 


U79275 


Homo sapiens 


unknown 


on 


100 


969


AJ011306


Homo 
sapiens 


guanine nucleotide exchange 
factor (long isoform)


2752 


99 


970 


AF281134 


Homo sapiens 




J. lob 


100 


971 


U53336


Caenorhabditis elegans


weak similarity over a short

ic^Auu \-w inyu&lil Heavy t-ila JLil 


536 


23


972 


AC018749 


Leishmania 
major 


L8840.12




53 


973 


AP188504 


Mus musculus


LNV 


- , 

bfl4 


85 


974 


U25801 


Homo sapiens


ka^jL i-> iiiuiiiy protein 


ob2 


98 


975 


AF049523


Homo sapiens


huntingtin-interacting
protein HYPA/FBP11


1390 


97


976


AF161530 




nolr\.lDZ 


1040 


100 


977 


G04020 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8101. 


626 


100 


978




Homo sapiens 


ribosomal protein L17 isolog 


908 


100


979 


U94991 


Xenopus 
laevis 


transcription factor XLM01 


795 


97 


980 


S7377EJ 


Homo sapiens 


calmitine; calsequestrine 


2029 


100 


981 


Y94888


Homo 
sapiens 


Human protein clone HP01462. 


2501 


100 


982 


AJ243191 


Homo sapiens 


heat shock protein 


827 


96 


983


X65020 


Bos taurus 


PSST subunit of the NADH: 
ubiquinone oxidoreductase
complex 


964 


85* 


984 


AJ249207 


Rhodococcus
sp. AD45


putative racemase 


351 


43 


985


Z30093


Homo sapiens 


basic transcription factor 2, 
35 kD subunit 


1576 


99 


986 


AHO3083 5 


Homo sapiens 


contains two glutamine rich 
domains, three zinc-finger
domains, and matrin 3
homologous domain 3 (MH3) 


4697 


99 


987 


AF227258 


Bos taurus 


RPGR- interacting protein- 1 


1262 


38 


988




Homo sapiens 


dJ1042K10.2 (supported by
GENSCAN, FGENES and GENEWISE)


4048


99 


989


JUjQ.i22.5o 


Homo sapiens


dJ1042K10.2 (supported by
GENSCAN, FGENES and GENEWISE)


2321 


99 


990


API AT 47 K 


Homo sapiens 


HSPC308


448 


92 


991


AF161426 


Homo sapiens 


HSPC308


448 


92 


992 


_L b JL <k Z b 


Homo sapiens 


HSPC308 


453 


92 


993 


AL023859 


Schizosaccharomyces pombe


tRNA-splicing endonuclease
subunit 


172 


42 


994 


AT.fl4 qfi^l 
r\Ju U 1 j O J 1 


Homo sapiens


dJ513M9.1 (novel Homeobox
domain protein) 


241 


47 


995 


AC005253 


Homo sapiens 


R26445 1 


902 


100 


996 


AP265206 


Homo sapiens 


MOG1 isoform A


974 


100 






Pyrococcus
abyssi


sarcosine oxidase, subunit 
beta (soxB) 


195 


26 


998


msi wu j oil 


Drosophila 


BG:DS00941.3 gene product 


218 


SB 


999


W69343 


Homo 
sapiens 


Secreted protein of clone 


1340 


98 


1000 




Homo sapiens


similar to bovine ADP/ATP 
translocase T1 mRNA with
GenBank Accession Number 
M24102.1 


1543 


100 


1001 


Y73381 


Homo sapiens 


HTRM clone 1877278 protein 
sequence.


1668 


100 


1002 


AF208844 


Homo sapiens 


BM-002 


426 


100 


1003 


AE004944 


Pseudomonas aeruginosa


hypothetical protein 


134 


35


1004 


AL031431 


Homo sapiens 


dJ462023.2 (novel protein) 


2058 


100 


1005 


S45367 


Canis
familiaris


centractin 


1949 


100




1006 


S45367 


Canis familiaris


centractin


1315 


98 


1007 


AB022158 


Mus 

musculus 


chaperonin containing TCP-1 
epsilon subunit 


2649 


96 


1008


Y76332 


Homo sapiens 


Fragment of human secreted
protein encoded by gene 38. 


1282 


97 


1009 


AB011414 


Homo sapiens 


Kruppel-type zinc finger 
protein 


1671 


58 


1010 


Z68218


Caenorhabdit 
is elegans 


K01H12.1 


269 


67 


1011 


AB011414 


Homo sapiens 


Kruppel-type zinc finger
protein 


1671 


58 


1012 


Z14000 


Homo sapiens 


RING1


2017 


100 


1013 


G02841 




Human secreted protein, SEQ
ID NO: 6922. 


■ITT 


93 


1014 


AF145*59 


Drosophila melanogaster


BcDNA.GH10333


1 ""3A A 


52 


1015


Y02860 




PyflCTTn^nh ftf human <~»r^v4»f"j»r-3 
fc *«gMit4»t \j ±. jjuiiia.il aCCtCbcu 

Drotein eTirnr^pH hv rtr*r\e* £c 


"C£A 


67 


1016 


Y02591 


Homo sapiens 


A human progesterone receptor 

complex p23-like protein.


772


97


1017


Y99448 


Homo sapiens 


Human PR01759 (UNQ832) amino 
acid seauenrp SEO td no ••4*7 a 


2323 


100 


1018


X67250 


Rattus 
norvegicus 


n-chimaerin


TtTR 

± i±\) 


97 


1019 


AF183417 


Homo 
sapiens 


microtubule-associated
proteins 1A/1B light chain 3 


631 


100 


1020 


AF164795 


Homo sapiens 


»taA tcguiatea protein janus-a 


674 


100 


1021 


AF19D625 


coturnix 




638 


96 


1022 


AL133363 


Arabidopsis
thaliana 


putative protein 


155 


37


1023 


AB034912 


Homo sapiens 


WD-repeat like sequence 


2483 


100 


1024 


AY007091 




mammalian inositol 
(IP6K2) mRNA with Ge


2243 


100 


1025


X69910 


Homo sapiens 


P63 protein 


2958


99 


1026


U8073* 


Homo sapiens 


CAGP9 


1657 


100 


1027


AB029333


Halocynthia
roretzi


HrPET-1


1046 


54 


1028 


AB032931 


Homo sapiens 


ubiquitin-conjugating enzyme
isolog 


1045 


100


1029 


G01797 


Homo sapiens 


ID NO: 5878. 


HA Q 

/lit 


98 


1030 


G01797 


Homo sapiens 


ID NO: 5878. 


it? 


98 


1031 


AF193795 


Homo sapiens 


vacuolar sorting protein 
VPS29/PEP11 


960 


100 


1032 


AJ222968


Mus musculus 


L-periaxin 


120 


30 


1033 


Z81317


Schizosaccharomyces pombe


DNA2-NAM7 helicase family 
protein 


685 


31 


1034 


Y41519 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 75. 


1321 




1035


AJ276004 


Mus musculus 


Paxneb protein 


1709 


77 


1036 


AF025459 


Caenorhabdit 
is elegans 


H14A12.3 gene product 


190 


30 


1037 


U37251 


Homo sapiens 


Description: KRAB zinc finger
protein; this is a splicing 
supplied by author 


196 


43 


1038 


W74580 


Homo sapiens


Human membrane protein 
BA0306. 


1921 


97 


1039 


U88173 


Caenorhabditis elegans


weak similarity to 
Arabidopsis thaliana 
ubiquitin-like protein 8 


331 


80


1040 


AF290204 




D0K1 




99 


1041 


Y96730


Homo 
sapiens 


PROS39, a Costal-2 homologue. 


162 


22 


1042 


AF140683 








So 


1043 


AF151023 


Homo sapiens 


HSPC1B9 


1104 


100 


1044 


AF181631 


Drosophila melanogaster


BcDNA.GH04929


204 


37 


1045 


Y77985 


Homo sapiens


Human collectin amino acid
sequence.


1940 


100 


1046 


AJ243972 


Homo sapiens


6-phosphogluconolactonase


1317 


100 


1047 




Homo sapiens 


ATP specific succinyl CoA
synthetase beta subunit 
precursor 


2324 


99 


1048 


AL034550 


Homo sapiens 


dJ1184F4.2 (novel protein 
similar to nucleolar protein 
4 (NOL4) (NOLP))


981 


92 


1049 


API 6"? H 0 K 

/VP ilDJ O & J 


Homo sapiens 


pre-B lymphocyte protein 3 


634 


100 


1050 


AF201949 


Homo sapiens 


60S ribosomal protein L30
isolog 


868 


100 


1051 


AF190624 


Mus musculus 


mdgl-i 


236 


85 


1052 


AE003529 


Drosophila 
melanogaster 


CG6151 gene product 


160 


44 


1053 


G01191 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5272.


646 


98


1054


AL162756


Neisseria 
meningitidis 


Glu-tRNA(Gln) 

amidotransferase subunit A


682 


44 


1055 


Kpi r> t ace? 
Ar XoJLobb 


Rattus
norvegicus


tRNA selenocysteine 
associated protein 


1525 


99 


1056


TTQ OCA a 


Chlamydomonas reinhardtii


Mr19,000 outer arm dynein
light chain 


244 


34 


1057 


AF159141 


Homo sapiens 


breast cancer metastasis- 
suppressor 1 


6*63 


53 


1058 


AF230929 


Homo 
sapiens 


keratinocyte annexin-like 
protein pemphaxin 


1710 


99 


1059 




Homo sapiens 


putative membrane protein 


1363 


100 


1060 


AF224263 


Heterodontus
francisci


HoxDS 


742 


83 


1061


X63417 


Homo sapiens 


IRLB 


1037 


100 


1062


AL079345


Streptomyces
coelicolor
A3(2)


hypothetical protein


143 


27 


1063




Homo sapiens


Human Hydrolase protein-10 
(HYDRL-10) . 


2547 


100 


1064 




Homo sapiens 


acetyl- CoA synthetase 


3493 


99 


1065


Y133S6 


Homo sapiens 


Amino acid sequence of 
protein PRO221.


1363 


100 


1066 


AC006153 


Homo sapiens 


similar to Aquifex aeolicus
GTP-binding protein; similar


*62


98


1067


Y18930 


Sulfolobus solfataricus


hypothetical protein 


162 


29 


1068 


R65969 


Homo sapiens


Glioblastoma-derived
polypeptide.


887 


100 


1069 


Y07964 




fragment 


863 


96 


1070 


AF177476 


Rattus 
norvegicus 


CDK5 activator-binding 
protein 


1995 


86 


1071 


AF245505 


Homo sapiens 


adlican


3109 


99 


1072 


U92794 


Mus musculus 


alpha glucosidase II, beta
subunit 


147 


36 


1073 


G03889


Homo sapiens


Human secreted protein, SEQ
ID NO: 7970.


698


98


1074 


U15779 


Homo sapiens 


p70 


380 


28 


1075


Y13392 


Homo sapiens 


Amino acid sequence of 


1271


91














1076 


AF161457 


Homo sapiens 


HSPC339 


571 


100 


1077 


Y79509 




Human carbohydrate-associated
protein CRBAP-5.


2151 


98


1078 


AF223466


Homo sapiens


HT01S protein 


""ST? 


o o 


1079 


AL13296S 


Arabidopsis 
thaliana 


putative WD-40 repeat-protein


286 


29 


1080


AB024937 


Homo sapiens 


LUNX 


XZ. 01 


1UU 


1081


Y14768 


Homo sapiens 


V-ATPase G-subunit like 


579 


100 


1082


AF016416 


Caenorhabdit 
is elegans 


F2SA7.4 gene product 


141 


31 


1083


L13291


Homo sapiens


ADP-ribosylarginine hydrolase


802 


45 


1084 


AB041541 


Mus musculus 


unnamed protein product 


151 


44


1085 




Homo sapiens 


Human secreted protein, SEQ 

ID NO: 6003.


202 


97 


1086 




Homo sapiens 


H-REV107 protein homolog


833 


100 


1087 


API m C"1B 


Homo sapiens 


phosphatidylcholine transfer 
protein 


1142 


100 


1088 




Homo sapiens 


Amino acid sequence of a 
human RNA-associated
protein. 


2783 


100 


1089 


Y94867 


Homo 
sapiens 


Human protein clone HP10563. 


613 


100 




AJVU2 3 302 


Homo sapiens 


unnamed protein product 


130 


49 


1091 


AB041586 


Mus musculus 


unnamed protein product 


1103 


81


1092 


Y71277 


Homo sapiens 


Human Zlipo3 protein. 


606 


100 


1093 


D34973 


Mus musculus 


protein tyrosine phosphatase-
like


1131 


95 


1094


Y66677 


Homo 
sapiens 


Membrane-bound protein
PRO828.


522 


56 


1095 


Y87276 


Homo sapiens 


Human signal peptide 
containing protein HSPP-53 
SEQ ID NO: 53.


1029 


99


1096


Y87276 


Homo sapiens


Human signal peptide
containing protein HSPP-53
SEQ ID NO: 53.


863 


98 


1097 


AF161455 






742 


98


1098 


U80029 


Caenorhabdit 
is elegans 


similar to thioredoxin 


242 


39


1099 




Homo sapiens 


Sqv-7-like protein


1321 


99 


1100 


AJ005866


Homo sapiens


Sqv-7-like protein


1118


99 


1101


AJ005866 


Homo sapiens 


Sqv-7-like protein 


891 


99 


1102 




Homo sapiens 


Sqv-7-like protein


1016 


99 


1103 


AT.n nodi 


Homo sapiens 


hypothetical protein 


299 


31 


1104 


AF242194 


Drosophila melanogaster


brakeless-B 


147 


52 


1105 


AL031010


Homo sapiens 


dJ422F24.1 (PUTATIVE novel 
protein similar to C. elegans 
C02C2.5) 


968 


100 


1106 


U28016




parathion hydrolase
(phosphodiesterase)-related
protein


1624 


87 


1107 


AJ278150


Homo sapiens


putative lipid kinase


2207 


99 


1108 


G03733 


Homo sapiens 


Human secreted protein, SEQ 


495 


98 


1109 


AF217287 


Drosophila 
melanogaster 


G protein RhoBTB 


834 


54 


1110 


Y28921


Homo 
sapiens 


Human regulatory protein 
HRGP-7. 


941 


48 


1111 


Y28921 


Homo 
sapiens 


Human regulatory protein 
HRGP-7. 


1331 


51 


1112 


AF176704 


Homo sapiens 


F-box protein FBX9 


2027 


99 


1113 


AF182074 


Homo 
sapiens 


glioma tumor suppressor 
candidate region protein 2 


2418


100 


1114 


G04039 




Human secreted protein, SEQ
ID NO: 8120.


475


96






1115


AF229439


Mus musculus 


zinc finger protein 289 


1697 


91 


1116 


L40357 


Homo sapiens 




JUS 


100 


1117 


L40357


Homo sapiens 


thyroid receptor interactor 


404 


85 


1118


A12155 




nUluau CVf<n . 


1673 


100 


1119 


AL161542 


Arabidopsis thaliana


iHOiRCioHc xxkc proccin 


607 


53 


1120 




Homo sapiens


UU ^ / iXj.Lt> . JL \KaC 

LdAT / LdxiRouuxxn uepenuenc 

C L. LflCllJ nlildoc J-iJ.i\Jl> yi UCcXu/ 


2341 


98 


1121 


Y57901 


Homo sapiens


Human transmembrane protein
HTMPN-25.


321 


36 


1122 


214 122 


Xenopus
laevis


vr ft o 


455 


77 


1123 


AP225418 


Homo sapiens 


lipase 


1531 


97 






Homo sapiens


Zen GTPase interacting 
protein ZIP. 


3227 


100 


1125 


AL035690 


Homo sapiens 


dJ202I2i.i (novel protein) 


952 


100 


1126


AUUUU^l / 


Homo sapiens


CLIC2 


1286 


99 


1127 


AB030505


Mus musculus 


UBE-lc2 


1069 


74 


1128 


^733^ 5" 


Homo sapiens 


HTRM clone 1427838 protein 
sequence.


874 


100 


1129 


Y78941 


Homo sapiens 


Cyclophilin-type peptidyl
prolyl cis/trans isomerase 
amino acid sequence. 


877 


100 


1130 


AL0235S3 


Homo sapiens 


dJ347H13.4 (novel protein)


557 


100 


1131 



Y91945 


Homo sapiens 


Human chaperone protein 6 
(HCHP-6).


1408 


100 


1132 


Z68197


Schizosaccharomyces pombe


putative nuclear pore protein 


596 


39 


1133 


Z68197


Schizosaccharomyces pombe


putative nuclear pore protein 


389 


35 




/Vr loObal 


Homo sapiens 


guanine nucleotide exchange 
factor 


3597 


100 






Mus musculus 


enhancer of polycomb 


264 


41 


1136 


M62419 


Mus musculus 


clathrin-associated protein 


2189 


99 


1137 


AO 00 6219 


Drosophila 
melanogaster 


clathrin-associated protein


1254 


78 


1138


Y76218


Homo sapiens 


Human secreted protein 
encoded by gene 95.


440 


98 


1139 


Wodjl04 


Homo 
sapiens 


A Rab protein designated 
HRABS-2. 


1065 


99 


1140 


VI 14(11 


Homo sapiens 


Amino acid sequence of 
protein PR0339. 


3979 


98 






chimeric -
Homo sapiens 


Green fluorescent protein- 
Zap70 fusion product.


3309 


100 


1142 


¥134*02 I 


Homo sapiens 


Amino acid sequence of 
protein PR0310 . 


1694 


99 


1143 


G03875


Homo sapiens


Human secreted protein, SEQ 
ID NO: 7956. 


660 


99 


1144 


Y12917 




Amino acid sequence of a 
human secreted peptide. 


750 


98 


1145


Y12917 


Homo sapiens 


Amino acid sequence of a 
human secreted peptide. 


1096 


100 


1146 


AL022157 


Homo sapiens 


SPIN (SPINDLIN HOMOLOG
(PROTEIN DXF34))


1233 


100 


1147 


AL022157 


Homo sapiens 


SPIN (SPINDLIN HOMOLOG
(PROTEIN DXF34))


1233 


100 


1148 


G02548 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6629. 


370 


93 


1149 


Y73338 


Homo sapiens 


HTRM clone 2019742 protein 
sequence . 


1492 


100 


1150 


W74B41 


Homo sapiens 


Human secreted protein 
encoded by gene 113 clone 


228 


55


SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY








HEAAR60.






1151 


AF044201 


Rattus 
norvegicus 


neural membrane protein 35; 
NMP35 


1570 


92 


1152 


AF1S^774 


Homo 
sapiens 


lysophosphatidic acid 
acyltransferase-gamma1


1865 


99 


1153 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


872 


64 


1154 


AF131852 


Homo sapiens 


Unknown 


473 


100 


1155 


Y41705 


Homo 
sapiens 


Human PRO352 protein
sequence.


1381


97 


1156 


G04036 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8117. 


607 


99 


1157 


AF112444 


Lupinus 
luteus 


L-asparaginase


287 


43 


1158 


AF151848


Homo sapiens 


CGI-90 protein


232 


32 


1159 


AJ272267 


Homo sapiens 


choline dehydrogenase 


2449 


100 


1160 


AB001773 


Ciona 
savignyi 


PEM-6 


196 


33 


1161 


Y87330 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107 
SEQ ID NO: 107. 


746 


83 


1162 


Y87330 


Homo sapiens 


Human signal peptide
containing protein HSPP-107
SEQ ID NO: 107.


746 


83


1163 


AF113534


Homo sapiens 


HP1-BP74 protein 


2723 


96 


1164 


AF232226 


Danio rerio 


Deddl 


191 


41 


1165 


AL118501


Homo sapiens


dJ1191N16.1 (A novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


1051 


71 


1166 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


945 


75 


1167 


AF187733 


Homo sapiens 


syntaphilin 


831 


42


1168 


AB019435 


Homo sapiens 


phospholipase


951 


55 


1169 


AF064604 


Homo sapiens 


KE03 protein 


324 


33 


1170 


Y01164 


Homo sapiens 


Polypeptide fragment encoded 
by gene 6. 


1191 


100 


1171 


L03188


Saccharomyces
cerevisiae


putative * 


180 


22 


1172 


AF113751


Mus musculus


nuclear pore membrane 
glycoprotein POM210 


3941 


81 


1173 


AJ245417 


Homo sapiens 


05b protein 


794 


100 


1174 


AL022238 


Homo sapiens


dJ1042K10.3 (novel protein)


1285 


100 


1175 


U4127B 


Caenorhabdit 
is elegans 


F33Q12.3 gene product 


332 


28 


1176 


M35^17 


Homo sapiens 


T-cell receptor V-alpha-J- 
alpha region 


284 


83


1177 


AC012680 


Arabidopsis 
thaliana 


putative protein phosphatase 
2C; 55455-56414 


209 


37 


1178 


G01345 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5426. 


692 


99 


1179 


AL096767 


Homo sapiens


dJ579N16.3 (novel protein
similar to worm, Arabidopsis 
and pine proteins) 


1342 


100 


1180 


AF039716 


Caenorhabdit 
is elegans 


similar to ATP synthase B
chain 


496 


55 


1181 


Y11710 


Homo sapiens 


collagen type XIV 


1048 


97 


1182 


X82240 


Homo 
sapiens) 
>R94974 
R94974 09- 
MAY-1996 27- 
OCT-1994 
Human TCL-1 
polypeptide. 


T cell leukemia/lymphoma 1 


617 


100






[Homo 
sapiens 








1183 


U42841


Caenorhabdit 
is elegans 


short region of weak 
similarity to collagen 


161 


33 


1185 


AJ131613 


Homo sapiens 


dicarboxylate carrier protein 


1470 


99 


1186 


L27645 


Danio rerio 


growth-associated protein


130 


36 


1187 


Y02738 


Homo sapiens 


Human secreted protein 
encoded by gene 89 clone 
HLHFP03.


636 


100 


1188 


AF217544 


Xenopus 
laevis 


ornithine decarboxylase-2


1459 


60 


1189 


AL136307 


Homo sapiens 


dJ380B8.2 (Neuritin, a
protein which promotes 
neurite outgrowth) 


182 


33 


1190 


X89602 


Homo sapiens 


rTSbeta 


197 


100 


1191 


U32828 


Haemophilus 

influenzae 

Rd 


ribosomal protein S6 
modification protein (rimK)


268 


31 


1192 


AF154831 


Rattus 
norvegicus


PV-1 


1403 


60 


1193 


y*d92* 


Homo sapiens 


Human fetal brain cDNA clone 
vcl6_l derived protein. 


918 


100 


1194 


AF026530 


Rattus 
norvegicus 


stathmin-like-protein splice 
variant RB3


1093 


97 


1195 


U35244 


Rattus 
norvegicus 


vacuolar protein sorting 
homolog r-vps33a


2981 


96 


1196 


Y70476 


Homo sapiens 


Human p53 target molecule, 
PRG3 protein. 


1680 


100 


1197 


AP157318 


Homo sapiens 


AD-017 protein


912 


47 


1198 


AF125443 


Caenorhabdit 
is elegans 


contains similarity to S. 
pombe phosphatidyl synthase 
(GB:Z2829S) 


460 


39 


1199 


AF201934 


Homo sapiens 


DC12 


1649 


88 


1200 


AL031775


Homo sapiens 


dJ30M3.3 (novel protein
similar to C. elegans 
Y63D3A.4) 


1902 


100 


1201 


M21103 


Ovis aries 


BIIIB4 high-sulfur keratin 


484 


62 


1202 


Z85986 


Homo sapiens 


dJ108K11.3 (similar to yeast
suppressor protein SRP40) 


1143 


75 


1203 


U18762 


Rattus 
norvegicus 


retinol dehydrogenase type I 


890 


52 


1204 


U35730 


Mus musculus


jerky 


2235 


76


1205


AB002327 


Homo sapiens 


KIAA0329


151 


24 


1206 


AB019233 


Arabidopsis
thaliana 


ubiquinone/menaquinone
biosynthesis
methyltransferase-like


762 


56 


1207 


AL136307


Homo sapiens 


dJ380B8.2 (Neuritin, a
protein which promotes 
neurite outgrowth) 


742 


100 


1208 


AF2079B9 


Homo sapiens 


orphan G-protein coupled 
receptor 


2326 


100


1209 


Z97630 


Homo sapiens 


dJ466N1.4 (novel protein 
similar to ANK3 (ankyrin 3,
node of Ranvier (ankyrin 
G))) 


181 


44 


1210 


U21549 


Mus musculus


Ac39/physophilin


1280 


66 


1211 


Y27700 




Human secreted protein 
encoded by gene No. 12. 


1267 


100


1212 


AF117814


Mus musculus 


odd-skipped related 1 protein


945 


66 


1213 


AF277233 


Naegleria 
fowleri 


calcineurin B 


222 


39 


1214 


D14B49 


Mus musculus 


meiosis-specific nuclear
structural protein 1


1950


77 


1215 


G03022


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7103. 


590 


100 


1216


Z72510 


Caenorhabdit


similarity to yeast UTR3 


634 


49


TABLE 2






is elegans 


protein (Swiss Prot accession 
yk677hll.5 comes from this 
gene 






1217 


Z49703 


Saccharomyces
cerevisiae


unknown 


134 


22 


1218 


AC013430 


Arabidopsis
thaliana 


F3F9.18 


199 


29 


1219 


L10910


Homo sapiens 


splicing factor 


1026 


71 


1220 


Z707SO 


Caenorhabdit 
is elegans 


similar to vanadate 
resistance protein 
transmembranous comes from 
this gene 


965 


58 


1221 


AL163815 


Arabidopsis
thaliana 


putative protein 


653 


61 


1222 


AF155100 


Homo sapiens 


zinc finger protein NY-REN-21
antigen 


2261 


100


1223 


J05071 


Bos taurus 


GTP-binding regulatory 
protein gamma-6 subunit 


356 


100 


1224 


Y73364 


Homo sapiens 


HTRM clone 2765991 protein 
sequence . 


1169 


99 


1225 


AL050170 


Homo sapiens 


Hypothetical protein 


714 


100 


1226


X64002 


Homo sapiens 


RAP74 


2fUl 


99 


1227 


X04085


Homo sapiens 


catalase 


2046 


100 


1228 


AJ005620 


Mus musculus 


skeletal muscle-specific gene


1416 


90 


1229


AF045564


Rattus 
norvegicus 


development-related protein


1715 


93 


1230 


X97S71 


Mus musculus 


HCMV-interacting protein


479 


9(^~ 


1231 


L0B239 


Homo sapiens 


located at OATL1 


2274 


100 


1232 


AF121863


Homo sapiens 


sorting nexin 14 


1964 


100 


1233 


AF121863 


Homo sapiens 




1 2 OS 


84


1234


AC024805 


Caenorhabdit
is elegans 


contains similarity to
TR:O04595 


/44 


31 


1235 


AC00*634 


Caenorhabdit 
is elegans 


contains similarity to 
Saccharomyces cerevisiae 
probable membrane protein 
YLR418C (GB:U20162)


ic 7 




1236 


Y18101 


Mus musculus 


macrophage actin-associated-
tyrosine-phosphorylated
protein P


1559




o / 


1237 


AB042646 


Homo sapiens 


TGIF2 


1224 


100 


1238 


AB026264 


Homo sapiens 


IMPACT 


1694 


xuu 


1239


AB026264 




IMPACT 


1123 


100 


1240 


G00429 


Homo sapiens


Human secreted protein, SEQ 
ID NO: 4510.


324 




1241 


Y76144 


Homo sapiens 


Human secreted protein 
encoded by gene 21. 


1363 


53 


1242 


AL035602


Arabidopsis 
thaliana 


putative protein 


499 


28 


1243 


X"V6483 


Gallus 
gallus 


Yes-associated protein 
(65kDa) 


574 


48 


1244 


AF220186 


Homo sapiens 


uncharacterized hypothalamus 
protein HT012 


503 


100


1245 


AL021453 


Homo sapiens 


dJ821D11.3 (PUTATIVE protein) 


856 


100


1246 


AJ276003


Homo sapiens 


GAR1 protein 


1216 


100


1247 


Y57910


Homo sapiens 


Human transmembrane protein 
HTMPN-34.


1363 


98


1248 


AC004874 


Homo sapiens 


similar to N-
acetylgalactosaminyltransferase;
similar to Q07537
(PID:g1171989)


957 


100 


1249 


AF199597 


Homo 
sapiens 


A-type potassium channel
modulatory protein 1


1139 


100 


1250 


Y1314B 


Rattus 
norvegicus 


PAG60B 


1350 


88 


1251 


M24B52 


Rattus 
norvegicus 


neuron-specific protein PEP-
19


124 


46


1252 


AF146738


Rattus 
norvegicus 


testis specific protein 


771 




1253 


G02725 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 66C6.


419 


If / 


1254 


W44375 


Homo sapiens 


Human ubiquitin-conjugating
enzyme polypeptide


1045


99


1255 


AC006538 


Homo sapiens 


BC41195 1 


831 


78 


1256 


AB004316


Bos taurus 


transformylase


" i gee """ 


88 


1257 


Z35094 


Homo sapiens 


SURF-2


1354


97 


1258 


Y13362 


Homo sapiens 


Amino acid sequence of 
protein PRO214.


2383 


100 


1259 


AC006014 




similar to RFP transforming
protein; similar to P14373

\k AU , j 431 / ) 


1299 


100 


1260 


AC005099 


Homo sapiens 


match to AI222572 
(NID:g3804775) 


' 4*9 


100 


1261 


V00507 




coding sequence of DHFR (1 is 
1st base in codon) (561 is 
3rd base in codon) 


984 


100 


1262 


X15443 


Rattus sp. 


gamma-glutamyltranspeptidase
(AA 1-568) 


597 


32 


1263 


AP173B71 


Mus musculus 


neuronal PAS 3 


977 


94


1264 


AF178983 


Homo sapiens 


Ras-associated protein Rap1


433 


97 


1265 


Y70473 


Homo sapiens 


Human cyclic nucleotide-
associated protein-1 (CNAP-
1).


2785 


99 


1266 
1267


Y41738 
AF061346


Homo 

sapiens 

Mus musculus 


Human PRO541 protein
sequence.

Edpl protein


1622 


100 


1268 
1269 


U9700£ 
AF233582 


is elegans 
Mus musculus 


C13F10.4 gene product 
GTPase Rab37 


1077 
154 


64 
23 


1270

1271
1272 


AF195951 

AL031177 
AF201933 


Homo sapiens 

Homo sapiens 
Homo sapiens 


signal recognition particle 
68 

dJ889M15.3 (novel protein) 
DC11 


942 
3127 

1150


95 
98 

55


1273
1274 

1275 


AF201933 
AL021710

AC004449 


Homo sapiens 
Arabidopsis
thaliana 
Homo sapiens 


DC11 

putative protein
R33683 3 


650 
346 
348 


100 
98 

49


1276


Y86295 


Homo sapiens 


Human secreted protein
HL2AGB7, SEQ ID NO: 210. 


556 
1920 


100 
100 


1277 
1278 


Y71111 
S94421 


Homo sapiens 
Homo sapiens 


Human Hydrolase protein-9
(HYDRL-9).

T cell receptor eta-exon 


1576 


99 


1279 
1280 


Y66695 
AF161380 


Homo
sapiens 

Homo sapiens


Membrane-bound protein
PRO1344.

HSPC262


478 
1909 


100 
100


1281


Y48610 


Homo sapiens 


Human breast tumour-
associated protein 71.


772 
779 


100 
100 


1282
1283 


AC015446
AK024432 


Arabidopsis
thaliana
Homo sapiens


Similar to AIG1 protein
FLJ00022 protein


406


35 


1284 
1285 


W9<Jl53 
AJ001019 


Homo sapiens 
Homo sapiens 


Human FADD-interacting
protein (FIP).
ring finger protein 


403 
1825 


35 
81 


1286
1287


AE003823 


Drosophila 
melanogaster 


CG13178 gene product


1301 
195 


100 
29 




AF178632 


Homo sapiens 


FEM-1-like death receptor
binding protein 


3261 


100 


1288 


AC006033 


Homo 
sapiens 


similar to MLN 64; similar to
138027 (PID:g2135214) 


1195 


100 


1289 
1290 


AC006033
AB023811 


Homo
sapiens 
Homo sapiens 


similar to MLN 64; similar to 

138027 (PID:g2135214) 

TU3A 


668 
351 


93 
54


1291 


273424 


is elegans 




235 


36 


1292 


Y94B71 


sapiens 


Human protein clone HP02551.


1222 


100 


1293 


AF190425 




retinoblastoma-associated
protein RAP140 


489 


29 


1294 


G03656 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7937.


536 


99 


1295 


AF133670


Mus musculus


ARL-6 interacting protein-2


367 


51 


1296 


AJ249735 


Homo sapiens 


claudin-6


1145


100




A3 / juU 


Escherichia 
coli 


pspE protein 


535 


100 


1298 


AF169284 


Homo sapiens 


LIM and cysteine-rich domains 
protein 1 


1997 


100 


1299 


U41023 


Caenorhabdit 
is elegans 


coded for by C. elegans cDNA 
yk61fl.3; coded for by C. 
ykl09h8 .5 


324 


7.9 


1300 


AB024523 



Homo sapiens 


basic kruppel like factor 


1206 


100 


1301 


X55989 


Homo sapiens 


eosinophil cationic-related 
protein 


737 


99 




AF007151 


Homo sapiens 


unknown 


1481 


100 


1303 


X52904 


Escherichia
coli 


open reading frame [AA 1-6^) 


359 


100 


1304


U19577 


Escherichia 
coli 


galactonate dehydratase 


242 


93 


1305


AF266508


Mus musculus


NELF protein 


1409 


97 


1306 


Y57901 


Homo sapiens 


Human transmembrane protein 
HTMPN-25. 


932 


100 


1307


U58750


Caenorhabdit 
is elegans


similar to the mitochondrial 
carrier family 


365 


54 


1308


AF044774 


Homo sapiens 


breakpoint cluster region 
protein 2 


2681 


99 


1309 


AL078593 


Homo sapiens 


dJ210Bl.l (KIAA0680)


267 


34 


1310 


X82693 


Homo sapiens 


E48 antigen 


620 


96 


1311


Z82263


Caenorhabdit 
is elegans 


C47A4.1 


283 


35 


1312 


AF131218 


Homo sapiens 


chromosome 16 open reading 
frame 5 


1493


100 


1313 


Y41763 


Homo 
sapiens 


Human PRO938 protein
sequence.


1636 


100


1314 


AF196972 


Homo sapiens 


CTM24 protein 


2239 


100 




Ar 0 533 56 


Homo sapiens 


insulin receptor substrate 
like protein 


228 


97


1316


Y66695 


Homo 
sapiens 


Membrane-bound protein
PRO1344.


1969 


100 


1317 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


2442 


89 


1318 


AF153127


Gallus
gallus 


SAPK interacting protein 


1477 


83 


1319 
1320


AF153127 


Gallus 
gallus 


SAPK interacting protein 


1651 


86 




X56932


Homo sapiens 


23 kD highly basic protein 


1044 


100 


1321 


AF174605 


Homo 
sapiensj 
>Y83086 
Y83086 09- 
MAR-2000 28- 
AUG-1998 F- 
box protein 
FBP-18. 
[Homo 
sapiens 


F-box protein Fbx25


467 


70


1322


M61732 


Trypanosoma 
cruzi 


neuraminidase


214 


24 


1323 


Y17013 


porcine 
endogenous 


pol 


304






retrovirus 








1324 


AL138655


Arabidopsis 
thaliana 


putative protein 


1174 


37 


1325 


AJj1Joo55 


Arabidopsis 
thaliana


putative protein 


946 


35 


1326 


AL13321S 


Homo sapiens 


bA108L7.2 (novel protein 
similar to rat tricarboxylate 
carrier) 


1322 


99 


1327 


AF161541 


Homo sapiens 


HSPC056 


1357 


99 


1328 


Y73346 


Homo sapiens 


HTRM clone 619699 protein 
sequence . 


785 


96 


1329 


L10910 


Homo sapiens 


splicing factor 


912 


82 


1330 


AF146568 


Homo sapiens 


MIL1 protein 


1936 


100 


1331 
1332


W87772 


Homo sapiens 


Human serum glucocorticoid- 
regulated kinase (H-SGK2) 
polypeptide . 


232


39 


Y41741 


Homo 
sapiens 


Human PRO704 protein 
sequence . 


1860 


100 


1333 


AF295096 


Homo sapiens 


zinc-finger protein ZBRK1


411 


91 


1334 


Z82271 


Caenorhabdit 
is elegans 


Similarity to Mouse kinesin-
like protein KIF4 comes from
this gene 


578 


44 


1335 


AE000810


Methanobacte
rium 

thermoautotr 
ophicum 


conserved protein 


290 


43 


1336 


Y68779 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-11. 


1019 


91 


1337 


AB027003 


Mus musculus 


protein phosphatase 


378 


84 


1338 


064856 


Caenorhabdit
is elegans 


weak similarity to TPR 
domains 


215 


46 


1339 


AE001394 


Plasmodium 
falciparum 


protein of the YMR7 family 


170 


29 


1340 


X76717 


Homo sapiens 


MT-1JL protein 


204 


89 


1341 


AC011914 


Arabidopsis 
thaliana 


putative mutT protein; 68398- 
67881


289 


45 


1342 


AJ276171 


Homo sapiens 


ASPIC 


2122 


100 


1343 


AF187016 


Homo sapiens 


myosin regulatory light chain
interacting protein MIR 


2303 


99 


1344 


AC006963 


Homo sapiens 


similar to Kelch proteins; 
similar to BAA77027 
(PID:g4650844)


894 


M 


1345 


AF2S7466 


Homo sapiens 


N-acetylneuraminic acid
phosphate synthase 


1880 


99 


1346 


Y25B96 


Homo sapiens 


Human secreted protein 
fragment encoded from gene 
64. 


1148


100 


1347 


AJ272073 


Torpedo 
marmorata 


male sterility protein 2-like
protein 


1664 


58 


1348


AF161548 


Homo sapiens 


HSPC063 


1018 


98 


1349 


W78128 


Homo sapiens 


Human secreted protein 
encoded by gene 3 clone 
HOSBI96. 


1117 


100 


1351 


G02144 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 6225. 


418 


100 


1352 


D90869 


coli 


c*AUl-lXttJC tu 


2047 


100 


1353 


A12029 


Homo sapiens 


MRP-14


613 


100 


1354 


AC00532B 


Homo sapiens 


R26660_1, partial CDS


870 


74 


1355 


AC024876 


Caenorhabdit 
is elegans


contains similarity to 
SW:RPB1_CRIGR


829 


61 


1356 


AF077226 


Homo sapiens 


copine III 


1876 


64 


1359 


AF217188 


Mus musculus 


YIP1B 


801 


63 


1360


AC074331 


Homo sapiens 


ZNF234 


3869


100 


1361 


AL163279


Homo sapiens 


homolog to cAMP response


5035 


99
element binding and beta
transducin family proteins






1362 


Z48475 


Homo sapiens 


glucokinase regulator 


3160 


99 


1363 


Z48475 


Homo sapiens 


glucokinase regulator 


2682 


97 


1364 


AF195764 


Homo sapiens 


megakaryocyte- enhanced gene 
transcript 1 protein; MEGT1 
protein 


2055 


99 


1365 


AF116609 


Homo sapiens 


PRO0915 


581 


100 


1366 


AF116609 


Homo sapiens 


PRO0915 


581 


100 


1367


AL117352 


Homo sapiens 


dJ876B10.3 (novel protein
similar to C. elegans 
T19B10.6 (Tr:Q22S57)) 


2581 


99 


1368 


Y34124 


Homo 
sapiens 


Human potassium channel 
K+Hnov15.


1342 


100 


1369 


AJ245621 


Homo sapiens 


CTL2 protein 


3728


99


1370 


AF008220 


Bacillus 
subtilis 


YtaG 


429 


45 


1371 


X0S562 


Homo sapiens 


alpha-2 chain precursor (AA -
25 to 1018) (3416 is 2nd base
in codon)


5908 


99 


1372 


Z9804B 


Homo sapiens 


dJ408N23.4 (novel DnaJ domain
protein) 


1296 


99 


1373 


AF154415 


Homo sapiens 


FLASH 


10253 


100 


1374 


U20286 


Rattus 
norvegicus 


lamina associated polypeptide 
1C 


1567 


69 


1375 


U53445


Homo sapiens 


DOC1 


1645 


46 


1376 


AL117337 


Homo 
sapiens 


bA393J16.1 (zinc finger
protein 33a (KOX 31))


250 


60 


1377 


AC005326 


Homo sapiens 


R26660_1, partial CDS


1126 


100 


1378 


U35113 


Homo sapiens 


metastasis-associated gene 


1823 


69 


1379 


1.15313 


Caenorhabdit 
is elegans 


putative 


858 


58 


1380 


Y25756 


Homo sapiens 


Human secreted protein 
encoded from gene 46. 


1508 


100 


1381 


AB037360 


Homo sapiens 


ANKHZN 


5734 


9S 


1382 


AB037360 


Homo sapiens


ANKHZN


959 


97 


1383 


AF237676 


Mus musculus


G beta-like protein GBL


1721 


96 


1384 


AF237676 


Mus musculus 


G beta-like protein GBL


1043 


70 


1385


Y58793 


Homo sapiens 


Human calcium regulatory
protein CaREG-1.


715 


100 


1386 


AF212162 


Homo sapiens 


ninein 


10369 


99 


1387 


AL031685 


Homo sapiens 


dJ963K23.2 (novel protein) 


337 


33 


1388


AC004B90 


Homo sapiens 


similar to zinc finger 
proteins; similar to BAA24380 
>W06316 W06316 03 -OCT- 1996 
27-APR-1995 TRP-1 protein. 


542 


86 


1389


AP1B7989 


Homo sapiens 


zinc finger protein ZNF223 


2665 


99 


1390


AC035150 


Homo sapiens 


Zinc finger protein ZNF221 


3459 


100 


1391 


AF2B7894 


Homo sapiens


PIST 


1410 


97 


1392 


AF28226S 


Homo sapiens 


Inner centromere protein 
INCENP 


1794 


99 


1393


X90840 


Homo sapiens 


axonal transporter of 
synaptic vesicles 


4584


99 


1394 


AF076249 


Homo sapiens 


zinc finger protein SBBIZ1 


3208 


99 


1395 


G02224 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6305. 


299 


75 


1396




Arabidopsis 
thaliana 


Similar to 


130 


34 


1398


AF242519 


Homo sapiens 


zinc finger protein SBZF3 


181 


66 


1399


AL133396 


Homo 
sapiens 


dJ1068H6.4 (prion protein
like protein doppel) 


962 


100 


1400 


Y48611


Homo sapiens 


Human breast tumour- 
associated protein 72. 


817 


99 


1401 


AC004472 


Homo sapiens 


PI. 11659 5 


280 


54 


1402 


X91489 


Saccharomyces
cerevisiae


putative HMG box 


164 


27
1403


Y79222 


sapiens 




2842 


100 


1404 


X8105B 


Mus musculus 


tex26i 


1010 


99 


1405 


AB012084 


Mus musculus




194 


29 


1406 


AB030251 


Homo sapiens 


GTPase activating protein 


3233 


99 


1407 




Rattus
rattus


fia-iiKe protein 


2684 


99 


1408


X75760


Drosophila
melanogaster


LiKKh 1 


364 


29 


1409 


V /box O 


Mus musculus 


N-RAP 


804 


48 


1410




Homo sapiens 


F20BB7_1, partial CDS 


835 


63 


1411 


AE000284 


Escherichia 
coli 


orf, hypothetical protein. 


360


100


1412 


X01563 


Escherichia 
coli 


L5 (rplE) (aa 1-179) 


911 


100 


1413 


W7B279 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 33. 


1264 


99 


1414 


AB031051 


Homo sapiens 


organic anion transporter 
OATP-E 


3832 


100 


1415 


M17466 


Homo sapiens 


coagulation factor XII 


3455 


100 


1416 


AF097994 


Homo 
sapiens 


L-kynurenine/alpha-
aminoadipate aminotransferase


2202 


99 


1417 


AF151077 


Homo sapiens 


HSPC243 


1262 


99 


1418 


Y09945 


Rattus 
norvegicus 


putative integral membrane 
transport protein 


1098 


61 


1419 


U13152 


Mesocricetus 
auratus


guanine nucleotide-binding 
protein beta 5 


2179 


76 


1420 


AL162458 


Homo sapiens 


bA465L10.5 (KIAA1176 (novel
protein, presumed ortholog
of mouse K-Cl cotransporter
KCC2))


5696 


100 


1421 


Y99426 


Homo sapiens 


Human PRO1604 (UNQ785) amino
acid sequence SEQ ID NO: 308. 


152 


29 


1422 


Y94923 


Homo sapiens 


Human secreted protein clone 
qsl4_3 protein sequence SEQ 
ID NO: 52. 


4039 


99 


1423 


AF177388 


Homo 
sapiens 


cancer-amplified 
transcriptional coactivator 
ASC-2


10748 


99 


1424 


Y48517 


Homo sapiens 


Human breast tumour-
associated protein 62. 


1851 


99 


1425 


AF208848 


Homo sapiens 


BM-006 


1454


89 


1426


At 2Doo9o 


Homo sapiens 


BM-006 


853 


79 


1427


AF112886 


Bos taurus


differentiation enhancing 
factor 1 


4693 


95 


1428 


U41387 


Homo sapiens 


Gu protein 


1372 


63 


1429 


AF161534


Homo sapiens 


HSPC049 


2853 


78 


1430 


AP125043 


Mus musculus 


bisphosphate 3'-nucleotidase


275 


30 


1431 


Y66718 


Homo 
sapiens 


Membrane-bound protein 
PRO1106.


1886 


100 


1432


Ar iyj blj 


Homo sapiens 


cell recognition molecule 
Caspr2 


568 


100 


1433


AB044560 


Mus musculus 


Gliacolm 


192 


34 


1434




Homo sapiens 


NTII-l nerve protein, 
facilitates regeneration of 
nerve cells. 


707 


51 


1435 


AF220530 


Homo sapiens 


myo-inositol 1-phosphate
synthase A1


2904 


100 


1436 


X70944 


Homo sapiens 


PTB-associated splicing 
factor 


1261 


72 


1437 


AF271732 


Homo sapiens 


bridging integrator-3


1282 


100 


1438 


Y30B11 


Homo sapiens 


Human secreted protein 
encoded from gene 1. 


595 


98 


1439 


AJ293659 


Homo sapiens 


mucolipidin


628 


97 


1440 


AF219138


Homo sapiens 


GGA3 long isoform 


3083 


100 


1441 


AF219138 


Homo sapiens 


GGA3 long isoform 


3346 


100
1442


AB039669


Homo sapiens


ALEX3 


1944 


100 




1443 


AP237711 


Drosophila 
melanogaster


Diablo 


131


27 


1444 


AJ011096 


Homo sapiens 


Naf1 beta protein


439 


39 




1445 


X73874






6233




1446 


AF214114




antigen BCAA


3999


99 


1447 


AF003924 


Homo sapiens 


ANC 2H01 


2645 


99 


1448


AF003136 


Caenorhabdit 
is elegans 


contains weak similarity to 
an AMP-binding motif


2843 


52 


1449 


AF155112 


Homo sapiens 


NY-REN-50 antigen


1184 


89 


1450 


Y95004 




vc54 1, SEQ ID NO: 46. 




100 


1451 


AF107203 


Homo sapiens 


ataxin 2-binding protein


688 


57 


1452 


AF107203 


Homo sapiens 


ataxin 2-binding protein


456 


78 


1453 


ziaoi'i 


Mus musculus




882 


56 


1454 


A7U3DO 


nuiiivj Qcipxens 


Protein sequence and 
annotation available soon via 
LABEIT@embl-Heidelberg.DE


510 


28 


1455 


&T.n 7 t\dn q 


Homo sapiens 


uo3d4M11.3 (similar to 
sialyltranferase) 


1356 


100


1456 


D44480 


Mus musculus 


MATH-2 protein


272 


100 


1458 


API 41 


Homo sapiens 


RNA helicase HDB/DICE1


478 


4b 


1459 


AF242552 


Gallus
gallus 


retinovin 


945


34 


1460 


TT 1 1 ft ^ C 


Homo sapiens 


mm 


724 


84 


1461 


AB6252S8 


Mus musculus 


granuphilin-a 


545 


39 


1462 


Y08134 


Homo sapiens 


acid sphingomyelinase-like
phosphodiesterase


2428 


99 


1463




Homo sapiens 


match to ESTs Z43979
(NID:g573097), R19699
(NID:g774333)


869 


98 




1464


AC0&4997 


Homo sapiens 


match to ESTs Z43979
(NID:g573097), R19699
(NID:g774333)


869 


98 


1465 


U32743 


Haemophilus 
influenzae
Rd 


fucose operon protein (fucU) 


315 


50 


1466 


Y09022 


Homo sapiens


Not56-like protein


2342 


100


1467 


AC003034 


Homo sapiens 


Homolog of rat kidney- 
specific (KS) gene 


1072 


99 


1468


AF071544 


Spinacia 
oleracea 


ribulose-1,5-bisphosphate
carboxylase/oxygenase small
subunit N-methyltransferase I


333 


2<j 




ij r?JU 


Homo sapiens 


Human transmembrane protein 


1053 


100 


1470 


AP032666 


Rattus 
norvegicus 


rsec5 


4504 


93 




1471 


Y70467 


Homo sapiens 


Human membrane channel 
protein-17 (MPPHP-17)


452 


74 


1472 


AL031033 


Homo sapiens 


C321D2.1 (Ribosomal Large 
Subunit Pseudouridine 
Synthase protein) 


1694 


100 


1473 


AF177292 


Homo sapiens




4026 


98 


1474 


S45936 


Homo sapiens


RlOJ. 


1101 


50 




1475 


Y8^241 


Homo sapiens


Human secreted protein 
HOABR60, SEQ ID NO: 156. 


1879 


98 


1476 


AJ010317 


Fugu 
rubripes 


Sand 


1278 


68 


1477 


U42831 


Caenorhabdit 
is elegans 


coded for by C. elegans cDNA
yk99b4.3; similar to human 
transforming protein 
(PIR:S22157) 


846 


44 


1478 


X62447 


Homo sapiens 


PR 264 


543 


61 




1479 


X82209 


Homo sapiens 


MN1 


7116 


100 


1480 


U10536 


Pan paniscus 


MHC class I A


675 


84 



WO 01/53312

TABLE 2

PCT/US00/34263


1481 


AL078599


Homo sapiens 


dJ9SlC6.1 (novel protein 
similar to C. elegans 
F55A12.9 (Tr:P91086)) 


1274 


65 


1482 


Z98977


Schizosaccha
romyces 
pombe 


putative vacuolar protein 


256 


29


1483 


AB005662


Mus musculus 


JNK/SAPK-associated protein-1


4968 


92 


1484 


AL050120


Homo sapiens


hypothetical protein 


716 




1485 


M27878 


Homo sapiens 


DNA binding protein 


1006 


53 


1486


Y69161


Homo sapiens 


Amino acid sequence of a 
partial protein kinase. 




39 


1487 


X84155


s cerevisiae 


ATH1 




29 


1488 


AF038963 


Homo sapiens


RNA helicase 




34 


1489 


U56966 


Caenorhabdit 
is elegans 


coded for by C. elegans cDNA
yk30b3.5; coded for by C.
elegans cDNA yk30b3.3


620 


42 


1490


AE000989 


Archaeoglobu 
s fulgidus 


enoyl-CoA hydratase (fad-4) 


533


46


1491


M80633 


Rattus 
norvegicus 


adenylyl cyclase type IV 


707 


95 


1492 


Y73342


Homo sapiens 


HTRM clone 2709055 protein
sequence.


3513


99 


1493 


Y17220 


Homo sapiens


fj283-ll). 


462 


37 


1494


AF133670 


Mus musculus 


ARL-6 interacting protein-2 


701 


97 


1495 


Y94897


Homo 
sapiens 


Human protein clone HP10574.


1371 


100


1496 
1497 


AL049699 
AF037447 


Homo sapiens 
Homo sapiens 


dJ747H23.2 (novel protein)
ribosomal S6 protein kinase


1550 


100 


1498


AL445067 


Thermoplasma 
acidophilus 


putative target YPL207w of
the HAP2 transcriptional 
complex related protein 


2427 
269 


100 
35 


1499 
1500 


AB039947 
AJ277750


Homo sapiens 
Homo sapiens


X11L-binding protein 51
UBASH3A protein 


227


3S 


1501 


AL050333 


Homo 
sapiens 


dJ93K22.1 (novel protein 
(contains DKFZP564B116))


3509 
2439 


100
100 


1502 
1503


AF179896
AF17894~B 


Homo sapiens 
Homo sapiens 


TALE homeobox protein Meis2a




100 


1504
1505 


Y5300S 
X82494 


Homo sapiens 
Homo sapiens 


pm749_8 protein sequence SEQ

ID NO:16. 

fibulin-2 


1177 
1442 


100 
99 


1506 

1507


X98296 
AL034548 


Homo sapiens 
Homo sapiens 


ubiquitin hydrolase 
dJ1103G7.6 (novel protein) 


3580 

/ b J 


99 
42 


1508 
"1509 


Y76144 


Homo sapiens 


Human secreted protein 
encoded by gene 21.


1098
1736 


100 
100 


"1510 


AF220182


Homo sapiens 


uncharacterized hypothalamus 
protein HT008 


1181 


98 




Utj4t>01 


Caenorhabdit 
is elegans 


Gene probably begins in the 
next cosmid


Ait; 


38 


1511 


AIi3 56192 " ■ 


Neurospora 
crassa 






Z9 


1512

1513


D17629
AF168717 


Homo 
sapiens 
Homo sapiens 


N-acetylgalactosamine 6-sulfate sulfatase (GALNS)
x 009 protein 


"1829 

694 


AUU 

99


1514 
1515 


AJ243531 
AC003672 


Homo sapiens 
Arabidopsis 
thaliana


nM15 protein 

putative C3HC4-type RING zinc 
finger protein 


735 
407 


100 
30 


1516
1517


AF115435


Rattus 
norvegicus 


syntaxin 17 


1374 


90 




AF003140 


Caenorhabditis elegans


C44E4.5 gene product 


274 


31 


"1518 

1519


AB002584 
AL121764 


Rattus 

norvegicus 

Schizosaccha 


beta-alanine-pyruvate

aminotransferase 

yeast atp12 protein precursor


2238
270 


82 
30 
romyces
pombe 


homolog






1520


AF255910 


Homo 
sapiens 


vascular endothelial 
junction-associated molecule 


547 


100


1521 


D31764 


Homo sapiens 


KIAA0064 


" 170 


21 


1522 


Y66634 


Homo 
sapiens 


" Membrane-bound protein 
PRO190. 


985




1523


Y944S0 


Homo sapiens 


Human inflammation associated 
protein 


250 


43 


1524 


AC000107


Arabidopsis 
thaliana 


F17F8.22 


277 


"~37 


1525 


AF109377 


Mus musculus 


ldlBp 


1277 


83 


1526 


AL031427 


Homo sapiens 


dJ167A19.4 (novel protein)


1432 


99 


1527 


Y08135 


Mus musculus 


acid sphingomyelinase-like
phosphodiesterase


1496 


1 Q 


1528 


AK024423 


Homo sapiens 


FLJ00012 protein


g"H '" 


100 


1529


AF154S62 


Homo sapiens 


quiescent cell proline 
dipeptidase


679 


100


1530 


AF205598 


Homo sapiens 




iJbo 


100 


1531 


AF251039


Homo sapiens 




1420 


50 


1532 


W74805 


Homo sapiens 


Human secreted protein 
encoded by gene 77 clone
HOEAS24.


493 


*7 

• 


1533 


AP039023 




Ran-GTP binding protein
RanBP5


5707 


99 


1534 


AC007190 


Arabidopsis 
thaliana 


F23N19.9 


n a 


37 


1535 


AB027564 


Homo sapiens 


DINB1 


4482 


100 


1536 


Y36178 


Homo sapiens 


Human secreted protein


377


87


1537 


Y50907 


Homo sapiens 


Human fetal brain cDNA clone
vb3_1 derived protein.


3693


99 


1538 


AF017368


Mus musculus 


faciogenital dysplasia 
protein 2 


177 


47 


1539 


AF266756 


Homo sapiens 


sphingosine kinase


2011 


99 


1540 


Z48804 


Homo sapiens 


OA1 


2238 


100 


1541 


AF000195 


Caenorhabdit 
is elegans 


Contains similarity to Pfam
domain: PF00169 (PH),
Score=20.6, E-value=1.9e-05,
N=1


379 


42 


1542 
1543


V711S9 

X76092 


Homo sapiens 
Homo sapiens 


Human phosphodiesterase 
interacting protein, 
myomegalin.

DNA binding protein RFX3 


9415 


99 


1544 
1545 


AB015330 
AF198487


Homo sapiens 
Homo sapiens 


HRIHFB2007

transcription factor LBP-1b


3327 
631 


100 


1546 


AF016417 


Caenorhabdit 
is elegans 


Similar to BZIP transcription 
factor 


2822 
518 


100 

42


1547 


X55885


Homo sapiens 


KDEL receptor 


110£ 


100 


1548
1549


AB03549S 1 " 
AL021707 


Carassius
auratus 
Homo sapiens 


ubiquitin-activating enzyme
E1

dJ508U5.4 (KIAA0668) 


836


42 


1550 


AJ223978 


Bacillus 
subtilis 


YvqK protein 


3688 


100 

i 


1551 


AF145615 


Drosophila 
melanogaster


BcDNA.OH03377 


822 




1552 


AL157734 


Schizosaccharomyces pombe


putative mannosyltransferase
involved in N-glycosylation 


435 


J7 


1553 


AF079S27 


Mus musculus 


Ier* 


691 


63 


1554


AB026291 


Rattus 
norvegicus 


acetoacetyl-CoA synthetase 


1099 


88 


1555 


Y44722 


Homo sapiens 


Human immune system molecule, 
ISMO-3. 


1780 


99 


1556 
1557 


AF116553 
Y71056 


Drosophila
melanogaster 
Homo sapiens


antennal-specific short-chain
dehydrogenase/reductase 
Human membrane transport 


277 
1975 


32 
99 
protein, MTRP-1.






1558 


Y71056 


Homo sapiens 


Human membrane transport 
protein, MTRP-1. 


1975 


99 


1559 


Y71056 


Homo sapiens 


Human membrane transport 
protein, MTRP-1.


1894 


97 


1560 


AF092050 


Mus musculus


beta-1,3-N-acetylglucosaminyltransferase


262 


44 


1561 


AL109827 


Homo sapiens 


dJ309K20.2 (acrosomal protein 
ACR55 (similar to rat sperm 
antigen 4 (SPAG4)))


1607 


97 


1562 


AJ131890 


Homo sapiens 


DNA polymerase lambda 


3002 


100 


1563 


AL035424 


Homo sapiens 


dA22D12.1 (novel protein 
similar to Drosophila Kelch 
proteins) 


301S 


100 


1564 


AC002400


Homo sapiens 


Gene product with similarity 
to Ubiquitin binding enzyme 


2790 


100 


1565 


AC005306


Homo sapiens 


R27216_1


919 


82 


1566 


AF000195 


Caenorhabdit 
is elegans 


Contains similarity to Pfam 
domain: PF00169 (PH) , 
Score=20.6, E-value=1.9e-05,
N=1


550 


45 


1567 


AB033281 


Homo 
sapiens 


F-box and WD-repeats protein
beta-TRCP2 isoform C 


2879 


100 


1568


D49473 


Mus musculus


truncated form of Sox17


1047 


78 


1569 


AX025270 


Homo sapiens 


unnamed protein product 


210 


91 


1570 


X75756 


Homo sapiens 


protein kinase C mu


4797 


99


1571 


AF145713 


Homo sapiens 


SCHIP-1


2388 


100 


1572 


AE003831


Drosophila
melanogaster 


CG18445 gene product 


180 


31 


1573 


AF074603 


Streptomyces
griseus 
subsp. 
griseus 


NonF 


205 


38 


1574 


U28993 


Caenorhabdit 
is elegans 


F22D3.3 gene product 


144 


27 


1575 


AF129507 


Homo sapiens 


transcription factor ICBP90 


287 


68 


1576 


X64878 


Homo sapiens 


oxytocin receptor 


2002 


100 


1577 


AF237711 


Drosophila 
melanogaster


Diablo 


421 


54 


1578


G00975 


Homo sapiens


Human secreted protein, SEQ 
ID NO: 5056. 


480 


100 


1579 


AF248744 


Cryptosporidium parvum


thrombospondin-related
adhesive protein 


123 


33 


1580 


AL121782 


Homo sapiens 


dJ585I14.2 (novel protein 
(translation of cDNA 
Em:AK000219) ) 


£63 


100 


1581 


AF041853


Homo sapiens 


kinesin family member protein 
KIF3A 


345 


33 


1582 


AF025441 


Homo sapiens 


Opa-interacting protein OIP5


ii9d 


106 


1583 


AE001803 


Thermotoga 
maritima 


glycerate kinase, putative 


349 


34 


1584 


AF252263 


Homo sapiens 


Kelch-like 1 protein 


3973 


100 


1585 


AF169675 


Homo 
sapiens 


leucine-rich repeat
transmembrane protein FLRT1


3494 


99 


1586 


AF118274 


Homo sapiens 


DNh-5 


2628 


97 


1587 


X79440 


Homo sapiens 


NADP+-dependent malic enzyme 


3167 


99 


1588 


X99802 


Homo sapiens 


ZYG homologue


3966 


99 


1589 


AF169803 


Homo sapiens 


flavohemoprotein b5+b5R


2563 


100 


1590 


"Y29861 


Homo sapiens 


Human secreted protein clone 
cb98_4.


181 


47 


1591 


Z25535 


Homo sapiens 


nuclear pore complex protein 
hnup153


7567 


99 


1592 


X13293 


Homo sapiens 


B-myb protein (AA 1-700) 


3678 


99 


1593 


M74027 


Homo sapiens 


mucin 


242 


27 


1594


AL139314 


Schizosaccharomyces


hypothetical protein 


235 


54 


pombe 








1595 


W78324 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 81. 


1318 


98 


1596 


Y94906 


Homo sapiens 


Human secreted protein clone 
rb649_3 protein sequence SEQ
ID NO: 18. 


2236 


98 


1597 


AF174605 


Homo sapiens 


F-box protein Fbx25 


1408 


99 


1598 


AB032254 


Homo 
sapiens 


bromodomain adjacent to zinc 
finger domain 2A 


9676 


98 


1599 


X73114 


Homo sapiens 


slow MyBP-C 


5568 


95 


1600 


X82200 


Homo sapiens 


gpStaf50 


2305 


100 


1601 


Y00876


Homo 
sapiens 


Human LAPH-1 protein 
sequence.


1149 


98 


1602 


AJ223351 


Homo sapiens 


HIRA-interacting protein 3


2821 


99 


1603 


AJ222801 


Homo sapiens 


neutral sphingomyelinase 


2268 


99 


1604 


AJ222801 


Homo sapiens 


neutral sphingomyelinase 


1601 


99 


1605


AF185576 


Mus musculus


POZ/zinc finger transcription 
factor ODA-8


3435 


97 


1606 


AF093744 


Homo sapiens 


unknown 


131 


"lTTo 


1607 


A12142 


synthetic
construct 


IFN-pseudo-omega 2


800 


98 


1608 


Y57949 


Homo sapiens 


Human transmembrane protein 
HTMPN-73. 


1868 


100 


1609 


AF151044 


Homo sapiens 




681


97 


1610 


X15218 


Homo sapiens 


ski protein (AA 1-728)


37*S 


100 


1611 


Y08200 


Homo sapiens 


rab geranylgeranyl 
transferase 


2976 


100 


1612 


AF220560 


Homo sapiens 


B/K protein 


2486 


99 


1613 


AC004461 


Arabidopsis
thaliana 


nodulin-like protein 


371


™2l 


1614 


Y09501


Homo sapiens 


NADH-cytochrome-b5 reductase 


1607 


100 


1615 


Y15521 


Homo sapiens 


start position 1 


3150 




1616 


AJ010750 


Rattus 
norvegicus 


Castration Induced prostatic 
apoptosis related protein-1,
(CIPAR-1) 


890




1617 


X58079


Homo sapiens 


S100 alpha protein 


481 


100 


1618 


Y66678


Homo 
sapiens 


Membrane-bound protein
PRO1009. 


967 


100 


1619


AJ242973 


Homo sapiens 


peptide methionine sulfoxide
reductase 


929 


100 


1620 


AF150733 


Homo sapiens 


AD-014 protein


288 


"lbb" " 


1621 


AJ007509


Homo sapiens 


E1B-55kDa-associated protein


4646


98 


1622 


X64177 


Homo sapiens 


metallothionein 


380 


100 


1623 


AE001045 


Archaeoglobu 
s fulgidus 


A. fulgidus predicted coding 
region AF0859 


240 


36 


1624 


AL355013 


Schizosaccha 

romyces 

pombe 


mitochondrial carrier protein 


403 


34 


1625 


Y66746 


Homo 
sapiens 


Membrane-bound protein 
PRO1198.


1184 


100 


1626 


D90053 


Sus scrofa 


destrin 


863 


100 


1627 


Y35954


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
203. 


756 


100 


1628 


AL031775 


Homo sapiens 


dJ30M3.2 (novel protein)


470 


100 


1629 


AF132484 


Mus musculus


unknown 


286* 


68 


1630 


AF01709^ 


Drosophila 
melanogaster 


similar to C. elegans 
R10H10.6 and S. cerevisiae 
YD8419.03C 


493


61 


1631 


X03077 


Homo sapiens 


lactate dehydrogenase-A


1704 


100 


1632 


AF151084 


Homo sapiens 


HSPC2S0 


763 


100 


1633


AJ001874 


Homo sapiens 


orf


255 


97 


1634 


AC012187


Arabidopsis 
thaliana 


Contains weak similarity to 
GATA-6 DNA-binding protein
gb|H36135, gb|Z26200 come 
from this gene. 


143 


38 


1635 


AF026246 


Homo sapiens 


HERV-E integrase 


411 


90 


1636


Y50943 


Homo sapiens 


Human adult brain cDNA clone 
ve8_1 derived protein.


1126 


95 


1637 


AF134593 


Homo sapiens 


L-pipecolic acid oxidase 


2068 


99 


1638 


AJ238247 


Mus musculus 


putative phosphatase subunit 


1948 


96 


1639


Y94942 


Homo sapiens 


Human secreted protein clone 
yk251_1 protein sequence SEQ
ID NO:50.


1320 


100 


1640 


AF235030 


Homo sapiens 


BM88 antigen 


766 


99 


1641 


AF233288


Drosophila 
melanogaster 


WDS


358 


26 


1642 


M19351 


Mus musculus 


immunoglobulin heavy chain 
binding protein 


145 


34 


1643 


Y70452 


Homo sapiens 


Human membrane channel 
protein-2 (MECHP-2).


1352 


100 


1644 


AF176520 


Mus musculus 


WD repeat-containing F-box
protein FBW5 


2(S7<i 


88 


1645 


W67816 


Homo sapiens 


Human secreted protein 
encoded by gene 10 clone 
HCEMU42.


1156 


100 


1646 


X67155 


Homo sapiens 


mitotic kinase-like protein-1


4456 


99 


1647 


M63180 


Homo sapiens 


threonyl-tRNA synthetase 


1040 


61 


1648


Y87342 


Homo sapiens 


Human signal peptide 
containing protein HSPP-119 
SEQ ID NO: 119. 


1566 


93 


1649 


R95332 


Homo sapiens 


Tumor necrosis factor 
receptor 1 death domain 
ligand (clone 3TW).


4137 


100 


1650 


AC007136


Homo sapiens 


Putative map kinase 
interacting kinase 


656 


99 


1651 


AB015346 


Homo sapiens 


Eps15R


4464 


99 


1652 


AL161576 


Arabidopsis
thaliana


putative protein 


1341 


48 


1653 


AC005313 


Arabidopsis 
thaliana 


putative calmodulin 


288 


28 


1654 


AL031428 


Homo sapiens 


dJ184J9.1 (KIAA0601 protein) 


3526 


100 


1655 


AL031428 


Homo sapiens 


dJ184J9.1 (KIAA0601 protein)


3526 


100 


1656 


AB017910 


Dictyostelium discoideum


myoM 

r 


297 


32 


1657 


Y28919 


Homo 
sapiens 


Human regulatory protein 
HRGP-5. 


2251 


99 


1658 


AF056191 


Homo sapiens 


TPA inducible protein 


2744 


98 


1659


U76846 


Arabidopsis 
thaliana 


ubiquitin-specific protease


137 


35 


1660 


AL078627 


Schizosaccharomyces pombe


actin-like protein; (2 actin 
domains) 


320 


34 


1662 


X52022 


Homo sapiens 


collagen type VI, alpha 3 
chain 


16274 


99 


1663 


AF300648


Homo 
sapiens 


guanine nucleotide binding 
protein beta subunit 4 


1811 


100 


1664


AF214736 


Homo sapiens 


EH domain containing protein 
2 


2774 


100


1665 


Z48613 


Saccharomyces cerevisiae


unknown 


138 


26


1666 


AF177385


Homo 
sapiens 


cytochrome c oxidase assembly 
protein isoform 2 


1395 


99 


1667 


AC007842 


Homo sapiens 


BC331191_1 


1581 


47 


1668


S67513 


Borna 
disease 
virus BDV, 
WT-1, Halle 
Bl/91, horse 
brain, field 
isolate, 
Peptide, 370 


p40 


397 


43 


aa 








1669 


Z99753 


Schizosaccharomyces pombe


putative NOL1-NOP2-sun family
nucleolar protein 


569 


47


1670 


G03130 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7211. 


427 


97 


1671 


M96625 


Gallus 
gallus 


cardiac muscle tensin 


1185 


54 


1672 


AF174482


Homo sapiens


polycomb 3 


2005 


99 


1673 


Y51846


Homo sapiens 


Human 18.1 homolog protein 
fragment.


233 


23 


1674 


AF255334 


Homo sapiens 


EXP35 


152 


29 


1675


Y94867


Homo 
sapiens


Human protein clone HP10563.


109 


30 


1676 


Y25712 


Homo sapiens 


Human secreted protein 
encoded from gene 2. 


3043 


99 


1677 


Y25712 


Homo sapiens 


Human secreted protein 
encoded from gene 2. 


1580 


91 


1678 


AF163151 


Homo sapiens 


dentin sialophosphoprotein
precursor 


170 


17 


1679 
1680 


AF163151 
AK024453 


Homo sapiens 
Homo sapiens 


dentin sialophosphoprotein 

precursor 

FLJ00045 protein


170 


17 


1681


AF019236 


Dictyostelium discoideum


TipD 


1349 
613 


100 
34 


1682 




Leishmania 
major


proteophosphoglycan 


153 


26 


1683 
1684 


Z69369 
X94910


Schizosaccharomyces pombe

Homo sapiens 


putative GTP-binding protein
ERp28


560 


46 


1685 

1686
1687 


AF286475 

AF191298
ACT275986 


Takifugu 
rubripes 
Homo sapiens 
Homo sapiens 


retinitis pigmentosa GTPase 
regulator-like protein 
vacuolar sorting protein 35 
transcription factor 


1334 
196 

4087 


100 
19 

100 




1688
1689 


X07311 


Homo sapiens 

Drosophila 

melanogaster


transcription factor 
heat shock protein 


2958 
1886 
138 


100 

88 

43 


1690 
1691 


ACT27207B 


Rattus
norvegicus 
Homo sapiens 


LIS1-interacting protein
NUDE1 

APOBEC-1 stimulating protein 


1383 
1256 


83 
68 




1692 
1693 

1694 


AJ272079 
AF177942 

AF263539 


Homo sapiens 

Xenopus 

laevis 

Homo sapiens 


APOBEC-1 stimulating protein
katanin p60 

arginine N-methyltransferase


1336
1664 


60 

64 




1695 
1696 


A^222^89 
AK000193 


Homo 
sapiens 
Homo sapiens 


protein arginine N-methyltransferase 1-variant 2
unnamed protein product 


1774 
1182 


100 
81 




1697 


AB041035 


Homo sapiens


kidney superoxide-producing
NADPH oxidase 


1060 
3122 


100 
100 




1698 
1699 


AB041035 
AF025772 


Homo sapiens 
Homo sapiens 


kidney superoxide-producing 

NADPH oxidase 

C2H2 zinc finger protein 


2181


100 




1700 

1701 
1702 


Y44676 

AK022407 
AB024574 


Homo sapiens 

Homo sapiens 
Homo sapiens


Human ARF-Related Protein-1 
(HARP-1)

unnamed protein product 
GTP-binding like protein 2 


488 
938 

315 
1172 


54 
97 

98 
100 




1703
1704
1705


uj ju t xy 
AF198092 
AE003573 


Homo sapiens
Mus musculus
Drosophila 
melanogaster 


zinc finger protein 42 
RP42 

CG12474 gene product 


421 

1057 

161 


52 
77 
33 




1706 
1707 


AB036345
Y55927 


Drosophila 
melanogaster 
Homo sapiens 


aquaporin 


164 


24




1708 

vm — 


D27121 

5391710 


Danio rerio 
Arabidopsis 


Human STLK2 protein. 

G12 

putative protein 


21461 

212 

505 


100 

47 

50 


thaliana 








1710 


" B01311 


Homo sapiens


nuiiumi t*KU^*±x poxVpepClQe » 


1649 


97 


1711 


U40750 


Mus musculus


formin binding protein 30


4561 


85


1712




Mus musculus


skeletal muscle and cardiac 


1490 


09 


1713 


AF255303 


Homo sapiens


membrane-associated nucleic 
acid binding protein 


4416 


99 






Homo 

sapiens


membrane-associated nucleic
acid binding protein


2960 


ibo 


1715 


U08227 


norvegicus 


Ras-related protein


511 


51 


1716


»pi canoe 
nr Xb 0 / 33 


Rattus 
norvegicus 


schlafen-4


1129 


44 


1717 




Homo sapiens 


SUMO-1-specific protease


5804 


99 


1718 


AL355737


Homo sapiens 


HMG20A 


1762 


100 


1719




Halocynthia 
roretzi 


HrPET-1 


1069 


46 


1720 


AF071317 


Mus musculus


COP9 complex subunit 7b


1297 


97 


1721 


AJ272215 


Homo sapiens 


HEYL protein 


1681 


99 


1722 


G01982 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6063.


718 


100 


1723 


AL032643 


Caenorhabdit 
is elegans 


similar to Uncharacterized 
protein family UPF0034,


825 


41 


1724 


G01972 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6053.


586


92 


1725 


Y94441 


Homo 
sapiens 


Human Adipose Specific 
Protein 1. 


1231 


100 


1726 


AF255443 


Homo sapiens 


CGI-201 protein


4397 


99 


1727 


AF183426 


Homo sapiens 


HT004 protein 


1810 


99 


1728


D10884 


Bos taurus 


neurocalcin 


1002 


99 


1729 


Z18529 


Gallus
gallus 


tensin 


1411 


84 


1730 


Z73423 


Caenorhabdit 
is elegans 


cDNA EST EMBL:Z14908 comes 
from this gene-cDNA EST this 
gene 


233 


41 


1732 


AF090891 


Homo sapiens 


PR00105 


470 


30 


1733 


AJ277724 


Homo sapiens 


histone deacetylase 8


2015 


100 




1734 


G04050 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8131. 


503 


95 




1735 


D45913 


Mus musculus 


leucine-rich-repeat protein


3531 


94




1736


AF096709


Drosophila 
virilis 


failed axon connections 
protein 


276 


32 




1737 


AF195120 


Homo sapiens 


dynactin p62 subunit 


2417 


99




1738 


L15314 


Caenorhabdit 
is elegans 


contains similarity to pfam 
family PF01772 N=l 


206 


37 




1739 


X54618 


Listeria
monocytogenes


phosphatidylinositol specific
phospholipase C


134 


27




1740 


AL031658 


Homo sapiens 


dJ310013.4 (novel protein 
similar to predicted C.
elegans and C. intestinalis
proteins) 


123 


31 




1741 


Y35924 


Homo sapiens 


Extended human secreted
protein sequence, SEQ ID NO. 
173. 


1013 


99




1742




Arabidopsis 
thaliana


F15H1B.15 


202 


32 




1743 


W75771 


Homo 
sapiens 


Human GTP binding protein 
APD08.


1932 


59 




1744 


W75771 


Homo
sapiens 


Human GTP binding protein 
APD08. 


1854 


61 




1745 


AF221098 


Homo 
sapiens 


Ral guanine nucleotide
exchange factor RalGPSlA 


1224 


70 




1746 


Y99*72 


Homo sapiens 


Human PRO1430 (UNQ736) amino 
acid sequence SEQ ID NO: 116. 


1332 


99 




1747 


Y94294 


Homo sapiens 


Human coenzyme A-utilizing


842 


100 
enzyme CoAEN-2.


1748

1749 


AK02443* " 
AE000877 


Homo sapiens 
Methanobacterium
thermoautotrophicum


FLJ00026 protein 

conserved protein


1619 
231 


100 
36 


1750 
1751 


AF101361
Y15067 


Drosophila 
melanogaster
Homo sapiens 


Abnormal X segregation 
ZNF232


193 
889 


JJ 
""100 


1752 
1753 

1754 


AF251038 
AC003093 

X69089 


Homo sapiens 
Homo sapiens 

Homo sapiens 


GAP-like protein
OXYSTEROL-BINDING PROTEIN;
45% similarity to P22059
(PID:g129308)
165kD protein 


822 
"352 

5703 


" 100 " 
57 

99


1755 
1756 

1757 


AL049795 
AL031393


Homo sapiens 
Homo sapiens 


dJ622L5.3 (novel protein)

dJ733D15.1 (Zinc-finger protein)


" 1039 

" 2765 


100 


1758 
1759 


' AB040g72 

"AL022238 
AF117653 


Homo sapiens 

Homo sapiens 
Homo sapiens 


UDP-GalNAc: polypeptide 

acetylgalactosaminyltransferase

dJ1042K10.4 (novel protein)
double homeobox protein 


2020 

776 


99 
43 


1760
1761 

1762 


Y12u£5 
AL049712 


Homo sapiens 
Homo sapiens 


"HFJopsT " — " 

dJ686C3.2 (nucleolar protein 
hNop56) 


375 

2959 

2595 


54 

99 
99 




AC002394 


Homo
sapiens 


Gene product with similarity
to dynein beta subunit 


1542 


i>l 


1763 
1764 


AF169017


Homo sapiens 


formiminotransferase
cyclodeaminase 


877 


100 




U91541 


Homo sapiens 


human formiminotransferase
cyclodeaminase (FTCD) protein,
carboxy-terminal end


596 


100 


1765 
1766


AB013365 
Y38421 


Bacillus
halodurans


YlqF


350 


34 


1767 




Homo sapiens 


Human secreted protein 
encoded by gene No. 36. 


145 


71 


1768 

1769

1770


AC009176 

AK000647
AJ238902 
U73522 


Arabidopsis
thaliana 

Homo sapiens
Homo sapiens
Homo sapiens


putative ribulose-1,5-bisphosphate
carboxylase/oxygenase small
subunit N-methyltransferase I
unnamed protein product 

VNN3 protein

AMSH


2l£ 

737 

2665 

1214 


"27 

99 
99 
"5* 


1771 
1772 
1773 
1774 

1775 


U89435 
SvOOll 
AL035086 
Y99426 

AF110330


Mus musculus

Rattus sp.
Homo sapiens 
Homo sapiens 

Homo sapiens 


unknown 

tricarboxylate carrier 
dJ44A20.2 (novel protein) 
Human PRO1604 (UNQ785) amino 
acid sequence SEQ ID NO:308. 
glutaminase 


829 
1604 
203* 
1057 

3146 


86 
95 
100 

99 

100 


1776 
1777 

1778 


AJ249529 
Z81579

AY007239 


Homo sapiens 
Caenorhabdit 
is elegans 
Homo sapiens 


glycerol 3-phosphate permease
cDNA EST yk75il.S comes from 
this gene 
monooxygenase X 


2787 
232 


Jl 


1779 
1780 


AL109608 
AF254260 


Schizosaccharomyces pombe

Homo sapiens 


oxysterol-binding protein
family 

tuftelin 1


187$ 
644 


99

38 


1781 


L07924 


Mus musculus


guanine nucleotide 
dissociation stimulator 


1729 
247 


100
50 


1782 
1783 


AF295773
AK024475 


Homo

sapiens 

Homo sapiens 


ral guanine nucleotide 
dissociation stimulator 
FLJ00068 protein


142 
4333 


49
100 


1784 

1785
1786


AK024475
G03933

?82637 J 


Homo sapiens 
Homo sapiens

Homo sapiens


FLJ00068 protein

Human secreted protein, SEQ
ID NO: 8014.

ig lambda-like gene/beta-


3996 
570 

247


93 
100 

100


glucuronidase exon 11 homolog 







TABLE 3



PCT/US00/34263



SEQ ID NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 


2 


BL00240 


Receptor tyrosine kinase 
class III proteins. 


BL00240B 24.70 8.2*0e- 
12 157-181 


3 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN 
SIGNATURE 


PR00109D 17.04 8.085e- 
13 358-381 


4 


BL00028


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 9.400e- 
10 1129-1146 BL00028
16.07 1.257e-09 820- 
837 


5 


BL00023 


Type II fibronectin 
collagen-binding domain
proteins. 


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 


6 


BL00023


Type II fibronectin 
collagen-binding domain 
proteins. 


BL00023 24.31 8.920e-
33 413-450 BL00023
24.31 4.545e-27 353- 
390 


7 


BL00023 


Type II fibronectin 
collagen-binding domain
proteins. 


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 


8 


BL00023 


Type II fibronectin 
collagen-binding domain
proteins. 


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 


9 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 S.li9e- 
09 863-917 


"16 


PR00464 


E-CLASS P450 GROUP II 
SIGNATURE 


PR00464D 17.40 6.182e- 
12 294-312 PR00464G 
12.41 4.231e-11 377-
393 


11 


PR00734 


GLYCOSYL HYDROLASE 
FAMILY 7 SIGNATURE 


PR00734I 11.46 4.296e- 
09 502-520 


12 


PF00023 


Ank repeat proteins.


PF00023B 14.20 6.500e- 
10 89-99 PF00023B 
14.20 2.636e-09 56-66 


14 


DM00031


IMMUNOGLOBULIN V REGION.


DM00031B 15.41 3.848e- 
09 79-113 


15 


PR00208 


GLIADIN AND LMW GLUTENIN
SUPERFAMILY SIGNATURE


PR00208A 12.59 9.8&8e- 
10 517-535 PR00208A 
12.59 2.233e-09 520- 
538 


17 


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.200e- 
14 282-295 PD00066 
13.92 9.400e-14 477- 
490 PD00066 13.92 
6.500e-13 505-518 
PD00066 13.92 9.500e- 
13 254-267 PD00066 
13.92 1.429e-12 393- 
406 PD00066 13.92 
6.571e-12 421-434




BL00845 


CAP-Gly domain proteins. 


BL00B45 It*. 43 2.200e- 
25 55-80 


20


BL00487 


IMP dehydrogenase / GMP
reductase proteins.


BL00487B 16.12 5.737e-
26 154-199 BL00487F
18.79 8.984e-22 235-
276 BL00487G 26.82 
4.082e-12 287-329 


21 


BL00487


IMP dehydrogenase / GMP 
reductase proteins. 


BL00487B 16.12 5.737e- 
26 154-199 BL00487F 
18.79 8.984e-22 235- 
276 BL00487G 26.82 
4.082e-12 348-390 


22 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 3.250e-
26 302-333 


23 


BL00107


Protein kinases ATP-
binding region proteins. 


BL00107A 18.39 3.250e- 
26 302-333 


25 


BL00115


Eukaryotic RNA 
polymerase II 
heptapeptide repeat 
proteins. 


BL00115T 8.45 7.273e- 
29 1208-1242 BL00115Q 
18.08 2.776e-21 953- 
983 BL00115Y 11.86 
8.000e-17 1604-1650 
BL00115M 19.19 8.130e- 
16 731-774 BL00115H 
14.34 9.392e-16 463- 
496 BL00115A 15.44
7.414e-15 43-82 
BL00115R 6.50 6.128e-
14 983-1010 BL00115J 
16.71 9.289e-14 591-
617 BL00115I 8.33 
4.336e-13 535-590 
BL00115L 12.25 5.939e- 
13 662-694 BL00115G 

IT <*C C la 1^ A1 e 

J.JL.OD D.U-.i.e-lJ 435- ' 
463 BL00115K 15.03
3.417e-10 617-659 

10 863-913 BL00115P
11.54 7.538e-10 913- 
953 BL00115S 18 .24 
7 96R*a-.in i m n»i n^*> 

BL00115U 10.34 4.475e-
09 1242-1265 


26 


BL00420 


Speract receptor repeat 
proteins domain 
proteins. 


~BL00420A 20 42 4 inQg 

11 81-110 BL00420A 
20.4-2 B.B20f»-tn fld-li** 


27 


BL00050 


Ribosomal protein L23
proteins. 


BL00050A 23.71 9.250e- 
27 94-127 BL00050B
14.81 8.125e-12 133-
147 


28 


PR00925 


NONHISTONE CHROMOSOMAL 
PROTEIN HMG17 FAMILY 
SIGNATURE 


PR00925B 3.73 3.089e-
10 41-54 


29


PF00756 


Putative esterase. 


PF00756C 14.12 1.108e-
09 486-516 


32 


BL00557 


FMN-dependent alpha-hydroxy acid
dehydrogenases proteins.


BL00557D 17. 7£ 5.065e- ' 
37 274-316 BL00557A 
35.08 8.909e-29 24-73
BL00557C 15.59 1.000e-
28 227-257 BL00557B
21.27 8.898e-22 130-
169 


34 


PR00629 


SHC PHOSPHOTYROSINE 
INTERACTION DOMAIN 
SIGNATURE 


PRO0629F 9 90 «5 RflCn 

35 299-328 PR0062SF 
10.95 8.364e-32 334- 
361 PR00629B 13.66 
3.786e-27 224-247 
PR00629A 13.45 8.364e- 
21 206-222 PR00629C 
3.80 4.000e-12 249-261
PR00629D 12.45 3.739e- 
11 276-286 


35 


PD01270 


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN. 


PD01270A 17.22 1.000e-
40 39-79 PD01270B 
22.18 2.875e-38 94-131 
PD01270D 24.66 3.700e- 
34 171-207 PD01270C 
19.54 3.455e-30 137- 
166 


36


PD01270 


RECEPTOR FC
IMMUNOGLOBULIN AFFIN. 


PD01270A 17.22 1.000e-
40 39-79 PD01270B
22.18 2.875e-38 94-131
PD01270D 24.66 3.700e-

34 171-207 PD01270C 
19.54 3.455e-30 137- 
166 


37 


BL00412


Neuromodulin (GAP-43) 
proteins.


BL00412C 10.28 9.241e-
10 264-298 


38 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412C 10.28 9.241e- 

1 ft "> C A *i Q a 


39 


BL00412 


Neuromodulin (GAP-43)
proteins. 


BL00412C 10.28 9.241e- 
10 264-298 


40 


PR00380 


KINESIN HEAVY CHAIN
SIGNATURE


PR00380B 12.64 7.366e-
14 342-360 PR00380C 
13.18 6.927e-13 375- 
394 PR00380D 9.93 
2.180e-12 429-451 
PR00380A 14.18 5.154e-
12 143-165 


44 


BL00345 


Ets-domain proteins. 


BL00345B 21.28 1.000e-
40 239-290 BL00345A 
13.96 2.452e-14 204- 
223 


45 


BL00345


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 215-266 BL00345A 
13.96 2.452e-14 180- 
199 


46 


DM01551 


Jew OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551A 15.63 3.538e-
26 172-202 DM01551C
14.62 3.571e-17 232-
252 DM01551B 8.84
4.750e-11 214-226


47 


PR00876


NEMATODE METALLOTHIONEIN
SIGNATURE 


PR00876B 7.** 9.326e- 
11 246-260 


48 


PD01066


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 4.231e- 
33 6-45 


50 


BL00972


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 



BL00972D 22.55 7.750e-
19 994-1019 BL00972A
11.93 7.120e-18 216-
234 BL00972E 20.72 
9.471e-14 1020-1042 
BL00972C 16.48 7.000e- 
13 360-375 BL00972B 
9.45 8.269e-10 302-312


51 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 7.750e- 
19 990-1015 BL00972A 
11.93 7.120e-18 216- 
234 BL00972E 20.72 
9.471e-14 1016-1038 
BL00972C 16.48 7.000e- 
13 360-375 BL00972B 
9.45 8.269e-10 302-312


52 


BL01115 


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 3.063e-
14 10-54 




PR00988


URIDINE KINASE SIGNATURE 


PR00988A 6.39 8.500e- 
17 20-38 PR00988F 
12.23 7.828e-l5 196- 
210 PR00988C 13.64 
6.108e-14 104-120 

11 174-186 PR00988D 
5.95 6.878e-10 160-171 
PR00988B 11.60 2.915e- 
09 57-69 


55 


PR00762


CHLORIDE CHANNEL 
SIGNATURE 


PR00762C 9.29 4.682e- 
21 294-314 PR00762D 
11.29 4.103e-19 509-
530 PR00762A 14.22 
9.333e-18 199-217 
PR00762F 15.12 3.100e- 
16 563-583 PR00762B 
12.12 6.063e-16 230- 
250 PR00762E 12.07 
2.286e-15 545-562

13 601-616 




BL00216 


Sugar transport 
proteins . 


BL00216B 27.64 8.800e- 
10 153-203 


58 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin
receptors. 


PF00791B 28.49 2.049e- 
10 1080-1135


59 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin
receptors. 


PF00791B 28.49 2.049e- 
10 1062-1117 


61 


PD01929 


KINASE TYPE RESISTANCE 
ANTIBIOTIC TRANSFERASE 
AM. 


PD01929E 10.76 9.018e- 
09 206-221 


68 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e- 
09 680-693 


69 


PR00360


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e- 
09 670-683 


70 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 8.714e- 
10 51-64 


72


DM00179


w KINASE ALPHA ADHESION
T-CELL.


DM00179 13.97 5.304e- 
09 108-118 


73 


BL00239 


Receptor tyrosine kinase
class II proteins. 


BL00239B 25.15 7.075e- 
12 118-166 


74 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790N 13.25 5.116e-
10 93-120 


76 


DM00471 


0 PROKARYOTIC DNA 
TOPOISOMERASE I.


DM00471A 11.73 9.357e- 
13 53-66 DM00471B 
8.45 4.857e-12 70-81 


80 


PD02876 


DECARBOXYLASE
PHOSPHATIDYLSERINE.


PD02876C 8.80 2.723e-
13 223-236 PD02876D
12.13 2.588e-12 334-
351 


81 


PD02876 


DECARBOXYLASE
PHOSPHATIDYLSERINE.


PD02876C 8.80 2.723e-
13 282-295 PD02876D
12.13 2.588e-12 393-
410 


83 


BL00708 


Prolyl endopeptidase 
family serine proteins. 


BL00708B 24.91 7.197e- 
12 570-601 


84 


PR00014


FIBRONECTIN TYPE III
REPEAT SIGNATURE 


PR00014C 15.44 8.043e- 
09 985-1004 


86 


PR00678 


PI3 KINASE P85 
REGULATORY SUBUNIT 
SIGNATURE 


PR00678H 9.13 1.379e-
09 246-269 


89 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 8.200e- 
09 264-279 PR00320B 
12.19 8.650e-09 264- 
279 


93 


BL00455 


Putative AMP-binding 
domain proteins. 


BL00455 13.31 2.588e- 
14 316-332 


95 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e-
10 123-154 


96 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e- 
10 212-243 


cW 




GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY
SIGNATURE


PR00081B 10.38 6.318e- 
13 134-146 PR00081A 
10.53 2.500e-12 54-72 


98 


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 5.500e- 
24 401-423 PR00380D 
9.93 7.188e-20 613-635
PR00380B 12.64 7.517e-
16 529-547 PR00380C
13.18 2.756e-13 560-
579
102 


PR00300


ATP-DEPENDENT CLP
PROTEASE ATP-BINDING
SUBUNIT SIGNATURE


PR00300A 9.56 7.545e-
14 289-308 


104 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 6.786e-
18 298-314 BL00479A
19.86 4.913e-16 155-
178 BL00479A 19.86
4.300e-13 272-295
BL00479B 12.57 6.294e-
12 181-197


106 


BL01019 


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 8.013e- 
12 43-83 


107 


DM01970 


0 kw ZK632.12 YDR313C
ENDOSOMAL III. 


DM01970B 8.6"D.5.000e- 
16 403-416 


108 


BL00191 


Cytochrome b5 family, 
heme-binding domain
proteins.


BL00191K 17.38 4.951e-
27 238-282 BL00191J
11.37 6.447e-17 182- 
204 


109 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 4.938e- 
37 8-47 


110 


BL01138 


Scorpion short toxins 
proteins . 


BL01138A 10.96 8.297e- 
10 38-50 


113


BL00107


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 5.800e-
23 156-187 BL00107B 
13.31 9.100e-14 225- 
241 


117 


BL00214 


Cytosolic fatty-acid 
binding proteins. 


BL00214B 26.51 1.000e-
17 46-91 BL00214A
21.17 7.052e-11 5-31


118 


BL00107


Protein kinases ATP-
binding region proteins. 


BL00107A 18.39 B.S^Oe- 
13 36-67 


119 


PR00529


GONADOTROPHIN RELEASING
HORMONE RECEPTOR
SIGNATURE


PR00529C 11.03 7.506e-
10 158-177


120 


PR00320 


G- PROTEIN BETA WD-40 
REPEAT SIGNATURE 


PR00320C 13.01 9.400e- 
09 80-95 


121 


PR00320 


G- PROTEIN BETA WD-40 
REPEAT SIGNATURE 


PR00320C 13.01 9.400e-
09 80-95 


127 


BL00215


Mitochondrial energy
transfer proteins.


BL00215A 15.82 7.158e-
13 216-241


128 


BL01032 


Protein phosphatase 2C 
proteins. 


BL01032C 6.14 3.195e- 
12 147-157 BL01032H 
11.25 5.680e-11 318-
331 BL01032G 8.33
8.932e-11 282-296
BL01032I 10.42 8.902e-
09 379-389


129 


BL01310 


ATP1G1 / PLM / MAT8
family proteins. 


BL01310 14.74 6.594e- 
26 28-64 


130 


PR00990 


RIBOKINASE SIGNATURE


PR00990B 12.32 9.534e-
15 47-67 PR00990A
16.23 5.500e-14 20-42
PR00990C 12.62 2.412e- 
09 119-133 


133 


BL00880


Acyl-CoA-binding
protein. 


BL00880 17.52 5.575e- 
26 72-122 


134 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 9.308e-
14 18-37 


135 


PR00215 


NEUROMODULIN SIGNATURE 


PR00215C 13.98 6.779e-
10 475-496 


136 


BL01310


ATP1G1 / PLM / MAT8
family proteins.


BL01310 14.74 5.432e-
29 71-107 


140 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.882e-
14 214-231 BL00028
16.07 9.471e-14 102-
119 BL00028 16.07
2.800e-13 18-35
BL00O28 1^.07 5.*00e- 
13 74-91 BL00028 

16 07 C) 1QOi»-Tl iflc. 

Jfc a . U / 7 . XVVK 1J 1BD" 

203 BL00028 16.07 
8.043e-12 46-63 
BL0002 8 16 07 R <J"*S^- 
12 130-147 BL00028 
16.07 9.217e-12 270- 
287 BL00028 16.07
6.192e-11 242-259
BL00028 16.07 4.000e-
10 158-175


141 


BL00501 


Signal peptidases I 
serine proteins. 


BL00501D 16.69 9.538e- 
14 113-133 BL00501C 
9.61 8.688e-10 89-101


143 


BL01020 


SAR1 family proteins.


BL01020C 15.35 7.722e- 
20 79-130 


146 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 6.400e- 
25 335-374 


149 


BL00126 


3'5'-cyclic nucleotide
phosphodiesterases
proteins.


BL00126C 22.07 1.450e-
25 509-550 BL00126E
35.22 3.951e-16 654-
709 BL00126D 25.50
1.360e-15 565-604
BL00126B 15.20 8.200e-
11 483-495 BL00126A
27.56 8.269e-11 442-
479


151


BL00632


Ribosomal protein S4 
proteins . 


BL00632 23.79 5.271e- 
20 106-149 


154 


BL00559 


Eukaryotic molybdopterin 

oxidoreductases

proteins. 


BL005591 13. *3 5.304e- 
19 29-58 BL00559K 
13.17 2.957e-18 172- 
199 BL00559J 19.63
8.385e-13 99-151 
BL00559L 13.60 5.814e- 
12 241-259 


155 


PR00449


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.692e- 
13 13-35 




BL00406


Actins proteins.


BL00406D 12.58 2.547e-
18 275-330 BL00406A
9.95 5.776e-16 15-50
BL00406B 5.47 7.429e-
12 69-124 BL00406C 
o./o y.DD^e-i^ 128-183 


160 


BL00132 


Zinc carboxypeptidases, 
zinc-binding region 1
proteins.


BL00132A 26.07 7.000e-
14 22-63 BL00132C 
21.35 3.466e-12 104- 
145 


"TS5 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE


PR00109B 12.27 9.043e- 
13 139-158 


168 


BL00362


Ribosomal protein S15


BL00362 24.67 9.700e- 


169 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 1.000e-
35 640-686 BL00039A
18.44 1.964e-13 212-
251 BL00039B 19.19
4.553e-13 378-404
BL00039C 15.63 8.773e- 
12 465-489 


175 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 3.721e-
12 14-36 


178 


BL01310 


ATP1G1 / PLM / MAT8
family proteins. 


BL01310 14.74 2.432e- 
29 133-169 


179 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.455e-
36 6-45




180 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007B 14.16 7.429e- 
20 160-180 PR00007A 
19.33 4.938e-19 133-
160 PR00007C 15.60
1.225e-15 206-228
PR00007D 9.64 6.885e- 
11 238-249 


181 


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 9.526e-
24 280-323 


182 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 9.526e- 
24 263-305 


183 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 9.526e-
24 280-323 


184


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 9.526e-
24 263-305 


188


PR00929 


AT-HOOK-LIKE DOMAIN 
SIGNATURE 


PR00929C 5.26 3.328e-
09 460-471 


189 


PR00929 


AT-HOOK-LIKE DOMAIN
SIGNATURE 


PR00929C 5.26 3.328e- 
09 440-451 


190 


BL00383 


Tyrosine specific 
protein phosphatases 
proteins. 


BL00383F 15.51 7.188e-
17 666-682 BL00383A
13.34 8.714e-17 162-
177 BL00383E 10.35
1.000e-14 333-344
BL00383E 10.35 7.300e-
14 628-639 BL00383F
15.51 1.720e-13 371-
387 BL00383C 10.10
3.000e-13 217-228
BL00383D 11.92 7.000e-

"1 ^ 7QQ_^no qt nmoin 
J.J zjd-JUo dXjU0Jo3o 

7.61 1.692e-11 187-196

09 509-520 BL00383D
11.92 4.000e-09 589-
602 BL00383B 7.61
8.000e-09 479-488


191 


PR00450 


RECOVERIN FAMILY
SIGNATURE


PR00450C 12.22 7.911e-
15 83-105 PR00450C
12.22 6.286e-13 47-69


193 


PF00564


Octicosapeptide repeat 
proteins. 


PF00564B 24.74 6.164e- 
16 227-278 


194 


PR00503


BROMODOMAIN SIGNATURE 


15 204-224 PR00503B 
9.96 9.571e-13 170-187 


195 


BL00901


Cysteine
synthase/cystathionine
beta-synthase P-
phosphate att.


BL00901C 20.63 3.429e- 
18 67-117 


197 


BL00636 


Nt-dnaJ domain proteins. 


BL00636A 8.07 6.211e- 
17 40-57 BL00636B 
15.11 2.000e-13 67-88 


198 


PR00690 


ADHESIN FAMILY SIGNATURE


PR00690A 10.86 9.866e- 
09 463-482 


199 


BL01131 


Ribosomal RNA adenine
dimethylases proteins. 


BL01131A 26.62 2.343e- 
12 84-130 


201


PR00910 


LUTEOVIRUS ORF6 PROTEIN 
SIGNATURE 


PR00910A 2.51 8.352e- 
12 509-522 


203 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.286e- 
10 39-72 


206 


PR00261 


LOW DENSITY LIPOPROTEIN
(LDL) RECEPTOR SIGNATURE


PR00261A 11.02 4.462e-
19 65-87 PR00261C
11.37 9.308e-19 65-87 
PR00261D 12.47 2.667e- 
18 65-87 PR00261B 
14.12 4.000e-18 143- 
165 PR00261A 11.02 
4.833e-18 143-165 
PR00261D 12.47 7.500e- 
18 143-165 PR00261B 
±*.xt 3.Ut>De*lo 65-87 
PR00261C 11.37 8.967e- 
16 143-165 PR00261F 
11.57 4.938e-13 143- 

7.188e-13 65-87 
PR00261F 11.57 7.188e-
13 65-87 PR00261E
11.08 1.643e-11 143-
165 


209 


PF00791 


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B ifi.49 £.143e- " 
13 118-173 PF00791C 
20.98 7.680e-10 132- 
171


211 


PR00007


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007A 19.33 5.781e- 
19 131-158 PR00007B 
14.16 4.115e-18 158- 
178 PR00007C 15.60 
1.675e-15 201-223
PR00007D 9.64 7.231e- 
11 233-244 


212 


BL00183


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 1.545e-
30 43-91


213 


BL00183


Ubiquitin-conjugating
enzymes proteins. 


BL00183 28.97 1.545e- 
30 43-91 


215 


BL00039


DEAD-box subfamily ATP-
dependent helicases


BL00039D 21.67 1.900e- 
29 568-614 BL00039A 
18.44 1.871e-23 21-60
BL00039C 15.63 1.720e-
11 364-388 BL00039B
19.19 4.064e-11 277-
303


217 


BL00100


Chloramphenicol
acetyltransferase
proteins. 


BL00100D 17.22 8.484e- 


219 


PR00213 


MYELIN P0 PROTEIN 
SIGNATURE 


PR00213C 15.94 3.969e- 
ii iyy-^27 


222 


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 1.947e-09
144-155 


224 


PR00875


MOLLUSC METALLOTHIONEIN
SIGNATURE


PR00875A 5.83 1.000e-
09 901-913 


225


BL00636


Nt-dnaJ domain proteins.


BL00636B 15.11 8.200e-
19 18-39


226


BL00636


Nt-dnaJ domain proteins.


BL00636A 8.07 1.000e-
21 21-38 BL00636B 
15.11 8.200e-19 45-66 


229


PR00301


70 KD HEAT SHOCK PROTEIN
SIGNATURE


PR00301F 13.98 7.563e-
13 329-346 PR00301G
13.78 4.300e-12 361-
382


230


BL00460


Glutathione peroxidases
selenocysteine proteins.


BL00460A 28.67 8.773e- 
20 35-70 BL00460B 
9.73 7.429e-16 78-96 

12 111-134 BL00460D 
16.89 8.773e-11 140-
160 


231


PR00647


SENR ORPHAN RECEPTOR
SIGNATURE


PR00647B 10.19 8.522e-
09 273-287


233


BL00292


Cyclins proteins.


BL00292B 20.31 7.429e-
27 244-275 BL00292A 
22.87 7.750e-27 201- 
235 


234


PR00449


TRANSFORMING PROTEIN P21
RAS SIGNATURE


PR00449A 13.20 6.308e-
13 7-29 PR00449C
17.27 4.4£2e-ll 47-70 
PR00449D 10.79 7.120e- 
11 109-123 


235 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


10 251-265 PR00019B 

11 36 5 370^-09 n o_ 
133 PR00019B 11.36 
1.000e-08 229-243 


236


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 7.300e-
10 245-259 PR00019B
11.36 5.320e-09 113-
127 PR00019B 11.36
1.000e-08 223-237


237 


PD00289 


PROTEIN SH3 DOMAIN 
REPEAT PRESYNA. 


onnnoBQ o ot o aaoa r»o « 

fuuu^oj 9.9/ o.44oe — 09 

67-81 


240 


PR00011


TYPE III EGF-LIKE
SIGNATURE


PR00011D 14.03 2.492e-
10 616-635


241 


PR00011


TYPE III EGF-LIKE 
SIGNATURE 


PR00011D 14.03 3.492e- 
10 616-635 


244 


BL00903


Cytidine and
deoxycytidylate
deaminases zinc-binding
region s. 


BL00903 12.93 6.941e- 
12 54-64 


245 


DM00179 


w KINASE ALPHA ADHESION 


DM00179 13.97 8.043e-
09 124-134 


248 


BL00246 


Wnt-1 family proteins. 


BL00246D 23.97 1.000e-
40 186-239 BL00246E
20.32 1.000e-40 305-
351 BL00246B 13.69
4.176e-36 105-140
BL00246A 15.75 2.286e-
24 70-90 BL00246C 
Ib.bo 4.H57e-22 150- 

175 


250 


PR00927 


ADENINE NUCLEOTIDE 
TRANSLOCATOR 1 SIGNATURE 


PR00927E 14.93 5.114e- 
10 253-275


254


BL00674 


AAA-protein family 
proteins.


BL00674B 4.46 1.000e-
09 223-245 


255 


PD01796 


PROTEIN TRANSMEMBRANE 
COBALT ZINC CADMIU. 


PD0179* 15.01 t^.04Se- 
09 61-88 


255 


BL50002 


Src homology 3 (SH3) 
domain proteins profile.


BL50002B 15.18 2.800e-
10 421-435 


258


PR00094


ADENYLATE KINASE 


PR00094C 12.94 2.200e- 
18 87-104 PR00094D 
12.52 2.731e-14 161- 
177 PR00094A 10.31 
5.500e-14 11-25 
PR00094B 11.01 4.115e- 
13 39-54 PR00094E 
11.25 7.333e-13 178- 


259 


BL00892


HIT family proteins. 


BL00892A 18.17 5.500e- 
13 60-91 


262


BL00388


Proteasome A-type
subunits proteins.


BL00388A 23.14 1.000e-
*w o-d^ DLi\JV Jo on 
31.38 3.864e-33 66-108 

nT.nn^ftfln on *7i 1 aaas 
duuujoou £\j . fx l.uuue- 

21 153-184 BL00388C 

18.79 8.147e-16 126- 

148


2*4 


BL00903 


Cytidine and 
deoxycytidylate 
deaminases zinc-binding
region s. 


BL00903 12.93 5.821e- 
09 91-101 


267 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.529e- 
09 241-257 


2lo 


BL00226


Intermediate filaments
proteins.


BL00226D 19.10 1.000e-
37 362-409 BL00226B 
23.86 8.043e-35 196-
244 BL00226C 13.23
7.000e-20 261-292
BL00226A 12.77 6.143e-
15 96-111 


271 


PD02952 


CHOLINE PROTEIN 


PU029b2(_ 15.76 9.731e- 
-6 235-265 PD02952B 

229 


272 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I. 


PD02929A 28.27 1.000e-
40 106-160 PD02929B
18.36 8.800e-17 179-
199


274


BL01027


Glycosyl hydrolases 
family 39 proteins. 


BL01027B 15.34 3.486e- 
09 213-250 


275 


PR00424


ADENOSINE RECEPTOR 
SIGNATURE 


PR00424D 14.32 6.451e- 
11 39-59 


277 


BL00052 


Ribosomal protein S7 
proteins. 


BL00052A 27.85" fi.OOOe- 
13 137-184 BL00052B 
15.17 5.143e-12 208-
235 


279 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790N 13.25 5.659e- 
13 267-294 


280 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e-
23 107-125 PR00319C
13.41 1.000e-21 89-105
PR00319A 15.27 8.364e-
21 51-68 PR00319B 
11.47 8.200e-19 70-85 


281 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e- 
23 94-112 PR00319C 
13.41 1.000e-21 76-92 
PR00319A 15.27 8.364e- 
21 38-55 PR00319B 
11.47 8.200e-19 57-72 


287 


PF00929 


Exonuclease. 


PF00929D 16.17 7.366e- 
09 149-163 


291 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 2.360e-
09 93-127 


292 


BL00326 


Tropomyosins proteins.


BL00326A 14.01 2.360e-
09 93-127 


294 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.714e- 
12 203-216 


295 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 5.500e-
15 322-339 BL00028
16.07 9.471e-14 433-
450 BL00028 16.07
4.600e-13 648-665
BL00028 16.07 5.500e-
13 760-777 BL00028
16.07 9.550e-13 788-
805 BL00028 16.07
3.348e-12 704-721
BL00028 16.07 6.478e-
12 461-478 BL00028
16.07 8.435e-12 844-
861 BL00028 16.07
1.692e-11 593-610
BL00028 16.07 2.038e-
11 211-228 BL00028
16.07 5.154e-11 732-
749 BL00028 16.07
5.846e-11 377-394
BL00028 16.07 6.885e-
11 816-833 BL00028
16.07 7.231e-11 676-
693 BL00028 16.07
9.654e-11 564-581
BL00028 16.07 4.086e-
09 517-534 BL00028 
16.07 7.429e-09 489- 
506 


296 


BL00215 


Mitochondrial energy- 
transfer proteins. 


BL00215A 15.82 3.333e- 
16 111-136 BL00215A 
15.82 2.723e-11 10-35
BL00215B 10.44 9.526e-
11 152-165 BL00215B
10.44 7.375e-10 59-72 
BL00215A 15.82 9.824e- 
10 205-230 


302 


PF00953


Glycosyl transferase.


PF00953C 19.70 8.773e-
34 236-269 PF00953A

129 PF00953B 6.17 
1.000e-13 182-194 


304


PF00152 


tRNA synthetases class 
II. 


28 422-461 PF00152C 

257 PF00152B 15.67 
2.658e-13 159-184
PF00152A 19.68 5.714e- 
11 44-67 


305 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 8.250e- 
35 37-76 


306 


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN.


PD02784B 26.46 5.840e- 
09 92-135 


307 


PR00454


ETS DOMAIN SIGNATURE


PR00454C 11.24 7.808e-
09 1167-1186 


308 


PR00237 


RHODOPSIN-LIKE GPCR 


PR00237E 13.03 5.091e- 
13 188-212 PR00237G 
19.63 7.207e-13 268- 
295 PR00237A 11.48 
4.375e-11 24-49
PR00237C 15.69 3.057e-
10 101-124 PR00237D
8.94 4.750e-10 137-159
PR00237F 13.57 5.364e-
10 230-255 PR00237B
J.J.OU 3.4Joe-20 57-79 


309 


BL00522 


DNA-polymerase family X
proteins.


BL00522C 11.90 7.577e-

14.90 1.310e-15 470- 
494 BL00522A 25.52 
1.265e-14 179-226 
BL00522E~19 6"i ft ci c P . 
14 430-460 BL00522B
27.30 9.625e-12 267- 
313 


310 


BL00326


Tropomyosins proteins.


BL00326D 8.76 5.235e- 
10 856-897 


312 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290A 20.89 4.706e- 
14 151-174 BL00290B 
13.17 9.000e-12 211- 
229 


313 


BL00345


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 34-85 BL00345A 
13.96 9.217e-16 1-20 


315 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 5.091e- 
15 63-76 


317


BL01020


SAR1 family proteins.


BL01020C 15.35 3.198e- 
17 79-130 


318 


BL00216


Sugar transport 
proteins. 


BL00216B 27.64 4.696e- 
11 164-214 


320 


PR00109


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.814e-
10 216-235




321 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 5.688e- 
10 329-372 


322 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 8.765e-
12 558-577


324 


BL01241 


Link domain proteins. 


BL01241 35.81 8.313e- 
30 163-236 BL01241 
35.81 3.222e-13 282-
335 


326 


BL00412


Neuromodulin (GAP-43)
proteins.


12 515-566 BL00412D
16.54 5.705e-11 516-
567 BL00412D 16.54
7.848e-10 518-569
BL00412D 16.54 1.827e-
09 514-565 BL00412D 
16.54 1.918e-09 513- 
564 BL00412D 16.54 
2.102e-09 520-571 


328


BL00232 


Cadherins extracellular 
repeat proteins domain 
proteins . 


BL00232B 32.79 9.557e-
20 151-199 BL00232B
32.79 2.246e-18 41-89
BL00232B 32.79 5.985e-
18 370-418 BL00232B
32.79 5.500e-16 258-
306 BL00232B 32.79
9.384e-15 475-523
BL00232C 10.65 2.537e-
12 256-274 BL00232C
10.65 4.326e-11 366-
386 BL00232C 10.65
7.261e-11 473-491
BL00232C 10.65 7.457e-
11 39-57 


330 


PR00454 


ETS DOMAIN SIGNATURE


PR00454C 11.24 7.808e- 
09 1167-1186 


331 


BL00598 


Chromo domain proteins. 


BL00598 14.45 8.393e- 
18 27-49 


333 


BL01016 

Glycoprotease family
proteins.


BL01016C 22.84 3.925e-
32 70-115 BL01016E
14.88 5.286e-19 149-
177 BL01016H 13.71
7.577e-13 291-301
BL01016D 8.86 3.298e-
11 127-140 BL01016G
7.14 5.622e-10 261-271
BL01016A 5.55 7.167e-
10 4-19 BL01016F

212 BL01016B 8.93 
8.855e-09 38-50 


339


BL01115


GTP-binding nuclear
protein ran proteins . 


nT.ni i 1 in ~y> t: cnn 0 . 

£>UU 1Ui£6 j.Duue — 

11 17-61 


340 


PD01066


ZINC- FINGER METAL- 
BINDING NU. 


PnOlOfifl 19 Al T Tlla. 
ruuiuoo ij . * j i.ijic' 

33 10-49


341




Kinesin light chain
repeat proteins . 


09 55-109 


342 


PD01066


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 2.400e- 
30 16-55 


343 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 1.000e-
40 20-68 


346 


PR00109


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.764e- 
11 135-154 


347 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.764e-
11 135-154


351 


BL01187 


Calcium- binding EGF-like 
domain proteins pattern 
proteins.


BL01187B 12.04 1.783e- 
13 100-116 BL01187B 
12.04 8.435e-13 276- 
292 BL01187B 12.04 
8.800e-11 13-29
BL01187B 12.04 7.429e-
10 54-70 BL01187B
12.04 5.725e-09 231-
247 BL01187A 9.98
7.000e-09 255-267


352


PD00078


REPEAT PROTEIN ANK
NUCLEAR ANKYR.


PD00078B 13.14 5.950e-
10 366-379 PD00078B 
13.14 4.522e-09 168- 
181 


354 


BL00380 


Rhodanese proteins. 


BL00380F 9.7* 6.694e- 

11 542-553


355 


PF00628


PHD-finger.


PF00628 15.84 1.000e-
11 116-131 


356 


PR00587 


SOMATOSTATIN RECEPTOR 
TYPE 1 SIGNATURE 


PR00587A 8.06 9.700e- 
09 17-37 


359 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 4.452e-
15 261-274 PD00066

246 PD00066 13.92 
4.300e-09 289-302 


361 


PF00791


Domain present in ZO-1
and Unc5-like netrin
receptors.


jrruu/:jjLO £o •1.) j . o Uie 

13 54-109 PF00791B 
28.49 1.095e-12 21-76 
PF00791A 27.85 1.432e- 

28.49 7.440e-09 184- 
239 


362 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin
receptors.


PF00791B 28 49 2 ?7^- 
11 279-334 


363


PR00450


RECOVERIN FAMILY
SIGNATURE


PR00450C 12.22 5.060e-
10 73-95 PR00450C
12.22 3.278e-09 109-
131 


364 


PF00242


DNA polymerase (viral)
N-terminal domain
proteins.


PF00242O 13.51 2.328e-
09 22-68 


365 


PF00242


DNA polymerase (viral)
N-terminal domain
proteins.


PF00242O 13.51 2.328e-
09 22-68 


366 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 6.644e- 
09 1038-1092


367


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 1.360e-
09 229-243 PR00019B
11.36 6.040e-09 91-105 
PR00019A 11.19 8.667e- 
09 370-384 


368 


PR00011


TYPE III EGF-LIKE
SIGNATURE


PR00011D 14.03 9.000e-
15 30-49 PR00011A
14.06 9.830e-15 30-49
PR00011B 13.08 4.500e-
14 30-49 PR00011C
24.25 5.143e-09 6-35 


369 


BL01032 


Protein phosphatase 2C 
proteins . 


BL01032H 11.25 4.150e- 
09 417-430 


372 


BL00478


LIM domain proteins. 


BL00478B 14.79 7.750e- 
12 410-425 


373


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 9.757e- 
34 26-65 




SODIUM CHANNEL SIGNATURE 


PR00170E *.4B 2.739e- 
10 88-118 


380 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 1.000e-
23 276-307 BL00107B 
13.31 1.692e-12 342- 
358 


381 


BL00455 


Putative AMP- binding 
domain proteins. 


BL00455 13.31 5.714e- 
12 50-66 


382 


PR00624 


HISTONE H5 SIGNATURE


PR00624G 4.08 4.900e- 
09 524-544 


384 


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.950e-
10 366-379 PD00078B
13.14 4.522e-09 168- 
181 


385 


PR00511


TEKTIN SIGNATURE 


PR00511D 7.11 5.371e- 
09 67-80 


386 


PD02870 


RECEPTOR INTERLEUKIN-1
PRECURSOR.


PD02870B 18.83 6.000e-
10 97-130 


388 


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 5.000e-
13 516-529 


389 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290A 20.89 7.667e- 
09 151-174 


390 


BL00215 


Mitochondrial energy- 
transfer proteins. 


BL00215A 15.82 5.200e- 
15 221-246 BL00215A 
15.82 7.618e-14 20-45 
BL00215A 15.82 8.851e-
11 123-148 BL00215B
10.44 9.526e-11 69-82
BL00215B 10.44 7.300e- 
09 272-285 BL00215B 
10.44 8.500e-09 165- 
178 


394 


BL00674 


AAA-protein family 
proteins. 


BL00674B 4.46 2.723e- 
16 299-321 


397 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 8.579e- 
11 141-155


398 


PR00761


BINDIN PRECURSOR
SIGNATURE 


PR00761B 9.93 6.764e- 
09 55-74 


399 


BL00240 


Receptor tyrosine kinase 
class III proteins. 


BL00240B 24.70 7.907e- 
10 118-142 


401 


PF00676 


Dehydrogenase E1
component.


PF00676B 7.4.71 0 . 071e- 
18 331-369 PF00676D 
14.40 3.854e-15 486- 
506 PF00676C 16.88 
9.182e-14 454-478 


402 


BL00514 


Fibrinogen beta and 
gamma chains C-terminal
domain proteins. 


BL00514C 17.41 4.673e- 
28 4432-4469 BL00514G 
15.98 6.092e-14 4555- 
4585 BL00514D 15.35 
2.532e-12 4473-4486
BL00514F 11.65 4.288e- 
10 4519-4534 BL00514H 
14.95 4.955e-10 4584- 
4609 


403 


PF00992 


Troponin. 


PF00992A 16.67 5.974e-
09 105-140 


404 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 1.450e-
10 73-87 PR00019A
11.19 8.043e-10 76-90
PR00019B 11.36 1.000e-
09 50-64 PR00019B 
11.36 1.000e-09 96-110 


405 


BL00232 


Cadherins extracellular
repeat proteins domain
proteins.


BL00232B 32.79 9.557e-
20 139-187 BL00232B 
32.79 2.246e-18 29-77 
BL00232B 32.79 5.985e- 
18 358-406 BL00232B 
32.79 5.500e-16 246- 
294 BL00232B 32.79 
9.384e-15 463-511
BL00232C 10.65 2.537e-
12 244-262 BL00232C
10.65 4.326e-11 356-
374 BL00232C 10.65
7.261e-11 461-479
BL00232C 10.65 7.457e- 
11 27-45 


407 


PF00426


Outer Capsid protein VP4 
(Hemagglutinin) . 


PF00426S 15.67 5.634e- 
09 902-940 


408


BL01160


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 9.695e- 
09 126-180 


410 


BL00741 


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 2.731e- 
09 252-275 


411 


PF00646 


F-box domain proteins. 


PF00646A 14.37 6.344e- 
09 86-100 


412 


BL00603


Thymidine kinase
cellular-type proteins.


BL00603B 11.39 8.500e- 
09 542-557 


415 


BL00866 


Carbamoyl-phosphate
synthase subdomain 
proteins. 


BL00866B 36.29 3.571e- 
31 245-291 BL00866C 
23.26 9.000e-25 331- 
366 


418 


PR00239 


MOLLUSCAN RHODOPSIN C-
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 6.114e- 
09 590-602 


421 


PF00791 


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 7.955e- 
14 23-78 PF00791B 
28.49 3.653e-12 273- 
328 PF00791B 28.49
4.273e-11 156-211
PF00791B 28.49 7.818e-
11 89-144 PF00791B 
28.49 1.524e-10 56-111 

09 37-76 PF00791C 

«v • 3 itJJC U J x ( u 

209 PF00791C 20.98
5.235e-09 381-420
PF00791B 28.49 6.202e-
09 189-244 PF00791B
28.49 7.028e-09 435- 
490 PF00791B 28.49 
8.679e-09 367-422 


424 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 7.207e- 
28 1545-1679 


425 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109D 17.04 5.881e- 
10 228-251 


429 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 4.600e- 
11 31-40 


431 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 1.844e-
34 490-536 BL00039A
18.44 5.615e-19 205-
244 BL00039B 19.19
8.920e-16 251-277
BL00039C 15.63 5.781e-
15 333-357 


432 


PR00452


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 7.652e- 
12 169-185 


433 


PR00828 


FORMIN SIGNATURE 


PR00828B 5.23 8.218e-
10 382-405 


436 


BL00415 


Synapsins proteins. 


BL00415N 4.29 8.643e- 
11 195-239 BL00415N 
4.29 3.036e-09 809-853 


443 


PR00834 


HTRA/DEGQ PROTEASE 
FAMILY SIGNATURE 


PR00834F 10.91 6.040e- 
11 221-234 


446 


PF01140


Matrix protein (MA),
P15.


PF01140D 15.54 9.663e-
10 183-218 PF01140D
15.54 3.093e-09 246-
281


449 


PR00568


DOPAMINE D3 RECEPTOR
SIGNATURE


PR00568G 13.95 5.551e-
09 39-53 


451 


PF00084


Sushi domain proteins
(SCR repeat) proteins.


PF00084B 9.45 3.813e-
10 47-59 


452 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.821e-
09 616-649


456


PR00380


KINESIN HEAVY CHAIN
SIGNATURE


PR00380A 14.18 1.000e-
25 77-99 PR00380D
9.93 1.000e-21 281-303
PR00380C 13.18 8.286e-
17 230-249 PR00380B 
12.64 4.724e-16 194- 
212 


457 


PR00253 


GAMMA-AMINOBUTYRIC ACID 
(GABA) RECEPTOR
SIGNATURE 


PR00253A 9.15 9.143e- 
24 246-267 PR00253B 
13.47 2.000e-23 272- 
294 PR00253C 13.85 
7.000e-23 306-328 
PR00253D 16.68 5.950e- 
21 452-473 


457 


PR00849 


GLYCOSYL HYDROLASE 
FAMILY 58 SIGNATURE 


PR00849D 9.77 9.236e- 
09 910-937 


471 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 8.200e-12
33-44 


472 


BL00226


Intermediate filaments
proteins.


BL00226B 23.86 3.721e-
09 282-330 


473 


BL00344 


GATA-type zinc finger 
domain proteins. 


BL00344 17.99 7.000e- 
12 814-852 


474 


BL00481 


Thiol-activated
cytolysins proteins.


BL00481E 13.07 8.909e-
09 173-199 


479 


PR00319


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE 


PR00319B 11 47 2 c;7l«_ " 

09 393-408 


480 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 1.900e- 
38 8-47 


481 


PR00405 


HIV REV INTERACTING
PROTEIN SIGNATURE


PR00405C 19.41 1.000e-

11.83 4.333e-18 430-
448 PR00405A 17.71
4.971e-18 411-431 


482 


PR00049


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 9.286e- 
10 959-974 PR00049D 
0.00 9.857e-10 958-973 
PRO0O49D 0.00 1 30*5^- 
09 937-952 PR00049D 
0.00 8.322e-09 939-954 


486 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 

PR00007B 14.16 8.615e-
23 653-673 PR00007A
19.33 6.192e-22 626-
653 PR00007C 15.60
5.846e-19 698-720
PR00007D 9.64 3.647e- 
13 732-743 


487 


PD00567 


PROTEIN RNA-BINDING RNA
REPEAT HYD. 


PD00567B 18.23 2.853e- 
09 200-214 


488 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.569e- 
12 3-21 


489 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 4.882e- 
27 30-69 PD01066 
19.43 3.430e-10 71-110 


490 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.864e- 
09 663-678 


492 


BL01128 


Shikimate kinase 
proteins . 


BL01128A 18.84 6.464e- 
17 58-92 


497 


PF00429 


ENV polyprotein (coat
polyprotein).


PF00429 31.08 7.171e-
15 21-71


498 


BL00120 


Lipases, serine 
proteins. 


BL00120B 11.37 ?.923e-
09 185-200 


500 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 7.353e- 
11 299-318


501


BL01159 


WW/rsp5/WWP domain 
proteins. 


BL01159 13.85 8.579e- 
12 131-146 


505 


BL00021 


Kringle domain proteins. 


BL00021B 13.33 3.739e- 
17 492-510


508


PR00120 


H+TRANSPORTING ATPASE
(PROTON PUMP) SIGNATURE 


PR00120C 9.90 5.800e- 
19 705-722 


509 


DM01417


6 kw INDUCING X?MC2 
MUSHROOM SPAC22G7.04. 


DM01417E 20.62 2.938e- 
16 362-395 DM01417D 
11.08 3.800e-13 322- 
338 


510


PF00534


Glycosyl transferases 
group 1. 


PF00534B 14.47 6.625e- 
09 346-370 


511 


PF00534


Glycosyl transferases
group 1.


PF00534B 14.47 6.625e-
09 293-317


512 


PF00534 


Glycosyl transferases 
group 1.


PF00534B 14.47 6.625e- 
09 366-390 


513 


PD01841 


PHOSPHORYLASE KINASE
ALPHA MUSCL. 


PD01841A 21.71 1.000e-
40 110-160 PD01841B
14.35 1.000e-40 181-
222 PD01841D 17.87
1.000e-40 243-295
PD01841F 13.36 1.000e-
40 333-382 PD01841G
24.26 1.000e-40 386-
440 PD01841L 18.42
1.000e-40 968-1010
PD01841I 23.00 4.545e-
37 762-804 PD01841E
18.60 3.750e-36 295-
333 PD01841J 14.94
6.023e-35 851-888
PD01841H 21.30 2.909e-
33 490-527 PD01841K
14.81 7.088e-33 924-
954 PD01841C 13.78

PD01841M 10.82 8.594e- 

*J- xuosi — J.V / J rUUXo4xI 

23.00 2.667e-13 549- 
591 


514 


PR00153


CYCLOPHILIN PEPTIDYL-
PROLYL CIS-TRANS
ISOMERASE SIGNATURE


PR00153C 11.01 7.188e-
13 95-111 PR00153E
9.10 4.150e-12 122-13H


515 


BL00740 


MAM domain proteins. 


BL00740A 13.87 7.188e-
12 410-423 


516 


DM00892


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 6.087e- 
12 1018-1052 


517 


BL00242 


Integrins alpha chain 
proteins. 


BL00242C 16.86 8.320e- 
09 12-42 


523 


DM00031 


IMMUNOGLOBULIN V REGION.


DM00031A 16.80 3.750e-
39 20-68 DM00031B
15.41 1.000e-25 84-118


525


BL00319 


Amyloidogenic 
glycoprotein 
extracellular domain 
proteins. 


BL00319C 17.12 8.375e-
10 61-95 


526 


PF00789


Domain present in
ubiquitin-regulatory
proteins.


PF00789B 19.70 3.308e-
12 322-343 PF00789C 
20.98 5.269e-09 367- 
392 


526 


BL01162 


Quinone oxidoreductase / 

zeta-crystallin 

proteins. 


BL01162C 22.80 1.500e-
16 120-164


529 


PR00910


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE 


PR00910A 2.51 3.893e- 
09 60-73 


532


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 4.000e-
17 11-36 BL00215A
15.82 8.660e-11 123-
148


533 


BL00215 


Mitochondrial energy- 
transfer proteins. 


BL00215A 15.82 4.000e-
17 11-36 BL00215A
15.82 8.660e-11 97-122


534 


BL00098


Thiolases acyl-enzyme
intermediate proteins. 


BL00098C 2X.65 2.800e- 
38 181-227 BL00098B
32.59 5.345e-38 86-141 
BL00098D 26.30 8.364e- 
35 245-288 BL00098E 
22.12 1.000e-34 314- 
352 BL00098F 10.18 
4.971e-22 365-386 
BL00098A 10.60 6.455e- 
11 38-50 


535


PR00370


FLAVIN-CONTAINING
MONOOXYGENASE (FMO) 
SIGNATURE 


PR00370E 11.96 7.429e- 
22 321-340 PR00370D 
16.33 6.143e-21 185- 
204 PR00370F 17.75 
6.559e-21 376-396 
PR00370B 10.91 9.591e- 
21 27-46 PR00370C 
12.72 3.500e-20 140- 
157 PR00370A 3.35 
6.442e-17 4-20


536 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.429e- 
16 285-302 BL00028 
16.07 6.294e-14 341- 
358 BL00028 16.07
1.346e-11 369-386
BL00028 16.07 1.692e-
11 397-414 BL00028
16.07 4.452e-11 453-
470 BL00028 16.07
7.231e-11 425-442
BL00028 16.07 4.300e-
10 313-330


537 


BL00762 


WHEP-TRS domain 
proteins. 


BL00762A 23.43 9.419e- 
15 844-881 


538


BL00762


WHEP-TRS domain 
proteins.


BL00762A 23.43 9.419e-
15 819-856


539 


BL00762


WHEP-TRS domain
proteins.


BL00762A 23.43 9.419e- 
15 822-859 


540 


PR00985 


LEUCYL-TRNA SYNTHETASE 
SIGNATURE 


PR00985A 12.10 9.000e-
10 357-375 


541 


PD02102 


SUBUNIT E V-ATPASE 
VACUOLAR ATP SYNTHASE 
HYDROL. 


PD02102A 16.74 1.000e-
40 3-47 PD02102B 
18.28 4.375e-34 57-100 
PD02102D 21.69 1.923e- 
30 179-218 PD02102C 
26.34 8.929e-26 100- 
146 


543 


BL00028


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 1.000e-
10 48-65 BL00028
16.07 6.400e-10 193-
210 BL00028 16.07
1.000e-09 343-360
BL00028 16.07 6.914e- 
09 78-95 


545 


BL00250 


TGF-beta family 
proteins. 


BL00250A 21.24 8.000e-
31 293-329 BL00250B 
27.37 5.286e-24 354- 
390 


547 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 2.714e-
09 196-201 PR00319A
15.27 7.344e-09 210-
227


548


BL01204 


NF-kappa-B/Rel/dorsal
domain proteins. 


BL01204A 17.74 1.000e-
40 8-56 BL01204D 
16.42 1.000e-40 177- 
221 BL01204E 13.83 
7.652e-30 225-250 
BL01204C 13.93 8.714e- 
22 141-160 BL01204B 
15.41 4.333e-16 102- 
116 


549 


PR00326 


GTP1/OBG GTP-BINDING 
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 8.364e-
15 255-276 


551 


PF00632 


HECT-domain (ubiquitin-
transferase).


PF00632C 20.66 3.302e- 
23 1569-1601 PF00632B 
18.45 3.700e-21 1515- 
1543 


554 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290B 13.17 1.600e-
14 187-205 BL00290A 
20.89 2.059e-14 130- 
153 


557 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 6.339e- 
09 846-879 


559 


DM01111


4 kw PHOSPHATASE
TRANSFORMING 61K PDF1.


DM01111L 11.93 3.762e- 
09 7-35 


562 


PF00658


Poly-adenylate binding
protein, unique domain
proteins.


PF00658C 16.33 9.455e- 
32 118-155 


564 


BL00141 


Eukaryotic and viral
aspartyl proteases
proteins.


BL00141A 12.10 4.150e-
10 472-488 


566 


PF00855 


PWWP domain proteins.


PF00855 13.75 5.667e-
15 272-289 


567 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 4.977e-
13 225-26B 


569 


BL00107 


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 7.000e-
19 118-149 BL00107B 
13.31 5.500e-15 183- 
199 


570


BL00107


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 7.000e-
19 116-149 BL00107B 
13.31 5.500e-15 183- 
199 


572 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 1.857e- 
34 454-483 PR00193C 
12.60 2.636e-31 223- 
251 PR00193B 11.69 
7.750e-29 171-197 
PR00193A 15.41 2.588e-
22 115-135 PR00193E 
19.47 6.559e-19 508- 
537 


573 


PR00193


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 1.857e-
34 470-499 PR00193C 
12.60 2.636e-31 239- 
267 PR00193B 11.69 
7.750e-29 171-197 
PR00193A 15.41 2.588e-
22 115-135 PR00193E 
19.47 6.559e-19 524- 
553


575


BL00752 


XPA protein. 


BL00752B 19.17 9.703e-
10 885-929 


576 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 7.000e- 
09 276-295 


577 


BL00116


DNA polymerase family B
proteins.


BL00116A 12.81 5.737e-
13 864-877 BL00116B
11.82 1.529e-12 952-
965


578 


BL00195 


Glutaredoxin proteins.


BL00195B 15.31 7.158e- 
09 121-141 


579 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 9.000e- 
11 217-231 PR00019B 
11.36 1.360e-09 386- 
400 PR00019A 11.19 
3 333e-09 iflQ-din 
PR00019B 11.36 8.920e- 
09 363-377 


580 


PR00253 


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR
SIGNATURE


PR00253A 9.15 2.125e-
25 275-296 PR00253B
13.47 7.923e-24 301-
323 PR00253D 16.68
5.846e-23 444-465
PR00253C 13.85 2.241e- 
20 335-357 


583 


PR00343 


SELECTIN SUPERFAMILY
COMPLEMENT-BINDING
REPEAT SIGNATURE


PR00343C 16.85 2.286e-
11 1233-1252 PR00343C
16.85 5.500e-11 333-
352 PR00343C 16.85
5.500e-11 783-802
PR00343C 16.85 4.246e- 
10 1491-1510 PR00343C 
16.85 8.230e-10 1686- 
1705 


"58T 


DM01537 


Ww SPCT5W KKT9 NTTf"*Ti.Pnr.&o 
HELICASE.


DM01537B 21.63 1.878e- 
37 79-126 DM01537B 
21.63 9.491e-30 916- 
963 DM01537A 15.14 


586 


PF00013


KH domain proteins
family of RNA binding
proteins.


PF00013 5.78 1.450e-09 


587 


DM00892 


3 RETROVIRAL PROTEINASE. 


13 262-296


589


BL00478


LIM domain proteins.


BL00478B 14.79 1.643e-
13 261-276 BL00478B
14.79 7.709e-09 321-
336


590 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.000e-
15 931-948 




PF00855


PWWP domain proteins. 


PF00855 13.75 8.000e-


593 


PF00628


PHD-finger.


PF00628 15.84 3.455e- 

TO AOa /no j 


594 


PR00205


CADHERIN SIGNATURE 


PR00205B 11.39 2.241e- 

— *» 330-3/0 fKUUZUDA 

14.73 9.308e-13 542- 
558 PR00205C 13.65 
5.304e-12 594-609 
PR00205B 11.39 4.273e- 
10 336-354 


596


BL00107


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.789e- 
18 307-338 


598


PD01675 


GLYCOPROTEIN MAJOR 
ENVELOPE PROBABLE U3.


PD01675C 19.89 2.330e- 
10 55-39 


6*06 


BL00242 


Integrins alpha chain 
proteins.


BL00242E 9.03 9.591e-
27 985-1014 BL00242C
16.86 4.115e-26 286-
316 BL00242D 13.57
4.150e-25 357-382
BL00242B 8.13 7.353e-
12 189-199 BL00242D
13.57 3.455e-11 421-
446 BL00242A 13.80
5.000e-11 61-73
BL00242D 13.57 4.956e-
10 291-316




PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320A 16.74 5.610e-
09 198-213 






SIGNATURE 


PR00278A 12.43 4.569e-
10 331-348


Cfl"4 




Phorbol esters /
diacylglycerol binding
domain proteins.


BL00479C 12.01 3.250e- 
12 170-183 


604 


BL00315 


Dehydrins proteins. 


BL00315A 9.35 1.672e-
09 424-452 


605 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.794e- 
10 295-339 


606 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 1.000e-
13 335-358 


608 


PF00855


PWWP domain proteins.


PF00855 13.75 5.167e-
15 265-282


609 


PF00855


PWWP domain proteins. 


PF00855 13.75 5.167e- 
15 211-228 


612 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


DM01206B 10.69 7.411e-
10 877-897 DM01206B
10.69 8.027e-10 861-
881 DM01206B 10.69
9.137e-10 873-893
DM01206B 10.69 1.456e-
09 859-879 DM01206B
10.69 1.797e-09 879-
899 DM01206B 10.69
4.076e-09 865-885
DM01206B 10.69 7.038e-
09 898-918 DM01206B
10.69 7.949e-09 871-
891 DM01206B 10.69
8.291e-09 767-787


615 


PD02699 


PROTEIN DNA-BINDING
BINDING DNA. 


PD02699A 8.51 2.023e-
28 129-158 PD02699C
24.84 1.000e-27 317-
364 PD02699B 18.28
1.000e-17 158-182


616 


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 4.086e-
22 288-310 PR00380D 
9.93 3.721e-17 486-508 
PR00380B 12.64 2.241e- 
16 410-428 PR00380C 
13.18 2.976e-13 436- 
455


617 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 4.086e-
22 288-310 PR00380D 
9.93 3.721e-17 486-508 
PR00380B 12.64 2.241e- 
16 410-428 PR00380C 
13.18 2.976e-13 436-
455 


618 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN.


DM01206B 10.69 5.143e-
12 531-551 DM01206B 
10.69 2.603e-10 535- 
555 


621 


PR00700 


PROTEIN TYROSINE
PHOSPHATASE SIGNATURE


PR00700B 16.80 3.160e- 
21 bol-5o2 


622 


BL00239 


Receptor tyrosine kinase 
class II proteins. 


BL00239F 28.15 3.222e- 
10 647-692 BL00239C 
18.75 8.304e-10 543- 
566 


623 


PR00407


EUKARYOTIC MOLYBDOPTERIN
DOMAIN SIGNATURE 


PR00407K 9.94 8.448e- 
09 326-339 


624 


BL00641 


Respiratory-chain NADH
dehydrogenase 75 Kd
subunit proteins.


BL00641C 21.10 1.000e-
40 157-202 BL00641E
24.37 1.000e-40 255-
308 BL00641F 33.12
1.000e-40 571-623
BL00641A 17.15 1.818e-
37 48-80 BL00641B
12.62 5.846e-34 113-
139 BL00641D 13.23
9.308e-29 216-240


627


PR00103


CAMP-DEPENDENT PROTEIN
KINASE SIGNATURE 


PR00103B 17.80 2.500e- 
18 367-380 PR00103B 
13.39 2.080e-14 297- 
312 PR00103A 9.59 
2.957e-14 282-297 
PR00103D 10.83 3.077e- 
12 346-358 PR00103C 
15.68 1.000e-11 334-
344 PR00103B 13.39
1.450e-11 175-190
PR00103A 9.59 1.720e- 
10 160-175 


630 


PR00081 


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081A 10.53 6 Alle- 
le 4-22 


631 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 8.500e-
14 37-50


632




CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 2.233e-
10 1324-1344 DM01206B 
10.69 4.822e-10 1276- 
1296 DM01206B 10.69 
7.658e-10 1328-1348 
DM01206B 10.69 8.274e- 
10 1280-1300 DM01206B 
10.69 4.532e-09 1320- 
1340 DM01206B 10.69 
7.266e-09 1326-1346 


635




Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 7.600e- 
23 145-176 BL00107B 
13.31 2.636e-13 211- 
227 


636 


BL00657 


Fork head domain 
proteins.


BL00657A 19.39 1.545e- 
30 101-14J 1 BL00657B 
22.27 7.750e-26 149- 
192 


637 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.000e-
10 607-623


643


BL00018 


EF-hand calcium-binding
domain proteins. 


BL00018 7.41 4.913e-09 
199-212 


647 


PF00628 


PHD-finger. 


PF00628 15.84 2.350e-
13 385-400 PF00628 
15.84 3.455e-12 464- 


648


BL01129 


Hypothetical 
yabO/yceC/sfhB family
proteins.


BL01129E 13.25 4.000e- 

25.56 8.200e-23 236- 
279 BL01129B 12.51 
6.118e-13 191-212 


649 


BL01228


Hypothetical cof family 
proteins. 


BL01228D 17.44 3.908e-
10 455-480 


650 


BL00027


'Homeobox' domain
proteins. 


BL00027 26.43 6.684e- 
13 771-814 


651 


BL50002 


Src homology 3 (SH3) 
domain proteins profile. 


BL50002A 14.19 1.750e- 
12 1026-1045 


653 


PR00253 


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR 
SIGNATURE 


PR00253A 9.15 4.000e- 
24 253-274 PR00253C 
13.85 8.800e-24 313- 
335 PR00253B 13.47
3.143e-22 279-301
PR00253D 16.68 7.652e-
20 422-443


654 


PD01719 


1 PRECURSOR GLYCOPROTEIN 
SIGNAL RE. 


PD01719A 12.89 4.452e- 
11 969-997 PD01719A 
12.89 3.961e-10 128- 
156 PD01719A 12.89 
7.395e-10 1276-1304 
PD01719A 12.89 1.222e- 
09 1220-1248 


657 


BL00354 


HMG-I and HMG-Y DNA- 
binding domain proteins 
(Ahook).


BL00354C 6.61 8.397e- 
09 563-578 


658


BL00354


HMG-I and HMG-Y DNA-
binding domain proteins
(Ahook).


BL00354C 6.61 8.397e-
09 580-595


659 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.174e-
13 539-572 DM00215
19.43 4.750e-12 549-
582 DM00215 19.43
9.824e-11 551-584
DM00215 19.43 2.929e-
10 548-581 DM00215
19.43 4.054e-10 550-
583 DM00215 19.43
5.339e-10 552-585
DM00215 19.43 7.107e-
10 544-577


660 


PR00688 


XYLOSE ISOMERASE 
SIGNATURE 


PR00688I 13.78 9.518e-
09 224-236 


661 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 5.950e- 
23 249-292 


662 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e- 
10 596-610 


663 


PR00360


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e-
10 596-610 


664 


PR00360


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e-
10 596-610 


666


PR00819 


CBXX/CFQX SUPERFAMILY
SIGNATURE 


PR00819B 10.83 8.900e- 
10 704-720 


667 


BL50040 


Elongation factor 1 
gamma chain profile. 


BL50040C 22.62 2.143e- 
16 135-170 


668 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 1.360e-
09 139-153 PR00019A 
11.19 1.667e-09 94-108 
PR00019B 11.36 4.600e- 
09 163-177 


670 


BL00018 


EF-hand calcium-binding 
domain proteins. 


BL00018 7.41 3.250e-10
681-694 BL00018 7.41 
6.400e-10 717-730 


672 


PD00131 


ATP-BINDING TRANSPORT
TRANSMEMBR.


PD00131B 34.97 1.000e-
34 356-410 PD00131C
19.59 1.346e-26 504-
542 


673


PR00667


RETINAL PIGMENT
EPITHELIUM-RETINAL GPCR
SIGNATURE


PR00667G 15.33 7.557e-
10 106-123 


674 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 4.857e- 
13 593-608 PR00320B 
12.19 4.115e-12 635- 
650 PR00320C 13.01 
8.435e-11 717-732
PR00320C 13.01 2.800e- 
10 635-650 PR00320C 
13.01 6.400e-10 593- 
608 PR00320B 12.19 
3.250e-09 593-608


675 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 4.857e-
13 572-587 PR00320B
12.19 4.115e-12 614-
629 PR00320C 13.01
8.435e-11 696-711
PR00320C 13.01 2.800e-
10 614-629 PR00320C
13.01 6.400e-10 572-
587 PR00320B 12.19
3.250e-09 572-587


676 


PR00019


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 9.667e- 
09 249-263 


679 


PF00642 


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 3.700e- 
16 225-236 PF00642 
11.59 7.900e-12 187- 
198 


680


PR00308


TYPE I ANTIFREEZE
PROTEIN SIGNATURE


PR00308C 3.83 8.754e-
10 286-296 


681 


BL00019 


Actinin-type actin- 
binding domain proteins. 


"BL00019D 15.33"O00e- 
19 227-257


682


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700D 12.47 4.000e-
09 99-118 


687 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.500e-
10 538-553 


689 


BL01024 


Protein phosphatase 2A
regulatory subunit PR55
proteins.


BL01024A 10.26 1.000e-
40 22-69 BL01024B
8.91 1.000e-40 86-127
BL01024C 7.80 1.000e-
40 146-185 BL01024D
13.22 1.000e-40 185-
222 BL01024E 11.96
1.000e-40 222-266
BL01024F 9.42 1.000e-
40 266-317 BL01024G
11.09 1.000e-40 317-
349 BL01024H 13.88
1.000e-40 389-442


691


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 8.071e-
31 152-195 


692 


BL00211 


ABC transporters family 
proteins. 


BL00211A 12.23 5.050e-
09 45-57 


693 


BL00211


ABC transporters family
proteins.


BL00211A 12.23 5.050e- 
09 45-57 


694 


BL00211 


ABC transporters family 
proteins.


BL00211A 12.23 5.050e- 
09 58-70 


696 


BL00680


Methionine
aminopeptidase subfamily
1 proteins.


BL00680 14.37 5.304e-
17 173-195 


697 


BL00741 


Guanine-nucleotide
dissociation stimulators
CDC24 family sign.


BL00741B 14.27 3.418e-
11 242-265 


698 


DM01930 


2 kw FINGER SMCX SMCY 

YDR096W. 


DM01930E 15.41 1.367e- 
37 170-215 DM01930F 
14.16 8.232e-28 267-
303 DM01930B 19.86
9.163e-10 37-71 


700 


PR00869


DNA-POLYMERASE FAMILY X
SIGNATURE 


PR00869A 12.80 1.281e- 
16 245-263 


701 


PR00048


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 2.174e-
10 77-91 PR00048A 
10.52 6.870e-10 133- 
147 PR00048A 10.52 
8.826e-10 105-119 
PR00048A 10.52 5.320e- 
09 161-175 


702 


BL00523 


Sulfatases proteins. 


BL00523E 19.27 2.565e-
25 326-356 BL00523A
13.36 5.050e-16 38-55
BL00523B 8.64 5.909e-
15 86-98 BL00523C
12.64 5.500e-13 137-
148 BL00523D 9.89
1.844e-11 290-302
BL00523G 9.46 5.500e-
10 513-523 BL00523F
10.85 6.351e-09 413-
424


703 


PR00048


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 8.412e-
12 376-390 PR00048B
6.02 1.000e-10 334-344
PR00048B 6.02 1.474e-
09 364-374 


707 


PD00787 


SYNTHASE BIOSYNTHESIS 
TRANSFERASE. 


PD00707A 14.84 8.941e- 
14 66-82 


708 


PR00761


BINDIN PRECURSOR
SIGNATURE 


PR00761E 14.32 8.500e- 
10 822-841 


712


DM01354


kw TRANSCRIPTASE REVERSE -
II ORF2.


DM01354Y 10.69 4.977e-
38 425-465 DM01354X
13.86 7.300e-34 376-
415 DM01354V 12.97
4.923e-17 311-358
DM01354W 12.64 5.596e-
10 356-376 


713


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 7.545e-
27 450-496 BL00039A
18.44 2.537e-18 147-
186 BL00039C 15.63
2.216e-14 280-304
BL00039B 19.19 1.947e-
13 194-220 


715


BL00383


Tyrosine specific 
protein phosphatases 
proteins. 


BL00383E 10.35 4.981e- 
10 150-161 


717 


PF00777 


Sialyl transferase 
family. 


PF00777C 18.60 4.035e- 
21 106-161 


718 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 3.750e- 
39 20-68 DM00031B 
15.41 2.688e-28 84-118
DM00031C 12.79 1.300e- 
12 131-142 


719 


BL00243


Integrins beta chain
cysteine-rich domain 
proteins. 


BL00243B 17.54 1.000e-
40 131-172 BL00243C
16.42 1.000e-40 172-
208 BL00243D 24.07
1.000e-40 222-274
BL00243F 22.63 1.000e-
40 314-356 BL00243I
31.77 6.571e-39 607-
650 BL00243E 16.70
3.077e-35 274-304
BL00243G 21.38 3.625e-
34 358-400 BL00243H
17.53 5.235e-29 567-
593 BL00243A 17.61
3.250e-21 63-84
BL00243H 17.53 7.167e-
16 477-503 BL00243H
17.53 2.304e-11 524-
550 BL00243H 17.53
5.304e-11 606-632
BL00243I 31.77 1.380e-
09 610-653


720 


PR00217


43 KD POSTSYNAPTIC 
PROTEIN SIGNATURE 


PR00217C 10.91 8.022e- 
09 20-36 


722 


PR00704 


CALPAIN CYSTEINE
PROTEASE (C2) FAMILY
SIGNATURE 


PR00704D 11.05 5.909e-
34 135-161 PR00704F
13.61 7.000e-26 190-
218 PR00704E 12.55
8.071e-26 165-189
PR00704B 17.94 2.241e-
23 75-98 PR00704A
14.68 4.094e-19 30-54
PR00704C 11.88 1.871e-
18 99-116


725 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e- 
09 169-187 


726 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e- 
09 169-187 


727 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 ?..125e- 
13 277-292 PR00320A 
16.74 1.310e-ll 277- 
292 PR00320C 13.01 
4.522e-ll 323-338 
PR00320A 16.74 6.586e- 
11 323-338 PR00320B
12.19 4.343e-10 323-
338 PR00320B 12.19
6.914e-10 277-292


731


PR00195


DYNAMIN SIGNATURE 


PR00195A 11.94 8.627e-
16 288-307 PR00195E 
9.82 3.912e-ll 457-474 


733 


PF00642 


Zinc finger C-x8-C-x5-C- 
x3-H type (and similar) . 


PF00642 11.59 9.082e- 
10 787-798 


738 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039A 18.44 2.555e-
28 26-65 BL00039D 
21.67 2.105e-20 338- 
384 BL00039C 15.63 
9.100e-13 160-184 
BL00039B 19.19 9.617e- 
11 73-99 


739 


BL01289 


TSC-22 / dip / bun 
family proteins. 


BL01289A 12.18 8.909e- 
31 326-353 BL01289B 
10.45 9.571e-17 353- 
383 


742 


BL01019 


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 7.078e-
12 41-81 


743 


BL00965


Phosphomannose isomerase 
type I proteins. 


BL00965C 23.78 1.000e-
40 256-305 BL00965B 
17.77 1.600e-25 126- 
153 BL00965A 10.57 
6.400e-19 94-113 


747 


BL00021 


Kringle domain proteins. 


BL00021D 24.56 4.563e- 
25 231-273 BL00021B 
13.33 5.345e-21 60-78 


748 


BL00612


Osteonectin domain 
proteins. 


BL00612B 11.35 2.034e- 
11 93-126 


749 


PR00450 


RECOVERIN FAMILY 
SIGNATURE 


PR00450C 12.22 6.880e- 
10 135-157 


752 


BL00795 


Involucrin proteins. 


BL00795C 17.06 6.000e- 
11 384-429 BL00795C 
17.06 9.444e-ll 370- 
415 


754 


BL00051 


Ribosomal protein L39e 
proteins. 


BL00051 20.92 1.935e- 
16 4-50 


755 


DM01970 


0 kw ZK632.12 YDR313C
ENDOSOMAL III.


DM01970B 8.60 7.723e- 
09 171-184 


760 


BL01020 


SAR1 family proteins.


BL01020C 15.35 9.020e- 
12 99-150 


762 


BL00046


Histone H2A proteins.


BL00046 12.95 1.000e-
40 33-88 


763


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 9.137e- 
10 206-240 


764 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 8.800e-
29 417-460 


767 


BL01208 


VWFC domain proteins. 


BL01208B 15.83 6.063e-
10 309-324 BL01208B
15.83 8.031e-10 165-
180 BL01208B 15.83
4.162e-09 85-100


770


BL00031


Nuclear hormones
receptors DNA-binding 
region proteins. 


BL00031A 19.55 5.571e- 
32 208-241 BL00031B
22.25 5.500e-27 242- 
274 


772 


PR00449


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.450e-
18 4-26 PR00449E 
13.50 3.520e-14 142- 
165 PR00449C 17.27 
3.032e-13 44-67 
PR00449D 10.79 8.579e- 
13 107-121 PR00449B 
14.34 3.455e-11 27-44


773 


BL00523 




BL00523E 19.27 9.333e-
23 299-329 BL00523A 

•i-Ji JO 6 .£UUc-*13 47-64 

BL00523B 8.64 2.607e- 

9.89 7.923e-12 224-236

10 141-152 BL00523F 
10.85 5.821e-10 373- 
384 


775 


BL00028


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.686e-
09 568-585


776


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.686e- 
09 621-638 


777


BL00028


Zinc finger, C2H2 type,
domain proteins. 


BL00028 16.07 7.686e- 
09 595-612 


778 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 8.412e- 
11 322-341 BL00030A 
14.39 7.000e-10 220- 


779


PR00079


GLUCOSE-6-PHOSPHATE
DEHYDROGENASE SIGNATURE


PR00079B 12.98 2.929e- 
26 193-222 PR00079E 
16.65 4.150e-23 348-
375 PR00079C 8.68
6.351e-16 246-264
PR00079D 13.51 7.070e-
16 254-281 PR00079A
16.12 6.769e-13 169- 
183 


781 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 9.250e- 
17 10-35 BL00215A 
15.82 6.000e-16 221- 
246 BL00215A 15.82 
7.857e-12 108-133
BL00215B 10.44 9.526e- 
11 168-181 


783 


PD00289


PROTEIN SH3 DOMAIN 
REPEAT PRESYNA. 


PD00289 9.97 6.276e-09 
159-173 


785


BL00690


DEAH-box subfamily ATP-
dependent helicases
proteins.


BL00690B 13.38 1.000e-
12 147-165 BL00690A 
6.87 5 320e-lD ii4.i?d 
BL00690C 7.51 3.189e- 
09 218-228 


786 


PR00449 


TRANSFORMING PROTEIN P21
RAS SIGNATURE


PR00449C 17.27 8.500e-
16 50-73 PR00449A
13.20 5.235e-14 8-30
PR00449E 13.50 2.853e-
11 150-173 PR00449D 
10.79 1.545e-09 111- 
125 


788 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 8.767e-
10 1-21 


790


BL00915


Phosphatidylinositol 3-
and 4-kinases proteins.


BL00915C 22.43 9.182e-
39 725-764 BL00915B
22.78 5.0b0e-33 633-
671 BL00915D 27.02
1.529e-21 795-831
BL00915A 10.09 1.000e-
13 395-407


791 


PR00208


GLIADIN AND LMW GLUTENIN
SUPERFAMILY SIGNATURE 


PR00208A 12.59 6.294e-
10 120-138 PR00208A
12.59 6.294e-10 121-
139 PR00208A 12.59
6.294e-10 122-140
PR00208A 12.59 6.294e-
10 123-141 PR00208A
12.59 6.294e-10 124-
142 PR00208A 12.59
6.294e-10 125-143
PR00208A 12.59 6.294e-
10 126-144 PR00208A
12.59 6.294e-10 127-
145 PR00208A 12.59
6.294e-10 128-146

10 129-147 PR00208A 
12.59 7.411e-09 130- 
148 PR00208A 12.59 
7.658e-09 131-149 
Donn^nnn i ^ co *» ar\A 

rKUUZUOA .37 /,JfQ4e- 

09 132-150 PR00208A 
12.59 8.274e-09 118- 
136 PR00208A 12.59 
8.274e-09 119-137


795 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 5.034e- 
16 302-320 PR00205A 
14.73 1.257e-11 284-
300 PR00205C 13.65
1.333e-11 337-352


796


BL00412


Neuromodulin (GAP-43)
proteins.


auU\J s ± ±4U Id . b*t 4 * QOOC'* 

12 196-247 BL00412D

D./UDe-Jil 137 — 

248 BL00412D 16.54 
7.848e-10 199-250 
BL00412D 16.54 1.827e-
09 195-246 BL00412D 
16.54 1.918e-09 194- 
245 BL00412D 16.54 


797 


BL00021 


Kringle domain proteins.


BL00021B 13.33 6.339e- 
13 40-58 


799 


BL01052


Calponin family repeat
proteins.


BL01052C 18.51 1.000e-
40 87-127 BL01052A
16.12 1.529e-32 3-35
BL01052B 15.31 1.257e-
25 52-78 BL01052D
10.26 5.737e-25 174-
194 


800 


BL00348 


p53 tumor antigen 
proteins.


BL00348F 23.19 3.714e-
09 197-240 


801 


BL00309


Vertebrate galactoside- 
binding lectin proteins. 


BL00309C 18.65 1.621e- 
09 62-87 


802 


PR00245


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245D 10.47 5.224e- 
09 187-199 


804 


PF00774


Dihydropyridine
sensitive L-type calcium
channel (Beta subuni.


PF00774A 16.47 8.457e-
10 110-156 


808 


PR00667 


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR
SIGNATURE 


PR00667C 11.71 9.875e- 
09 12-28 


810 


PD02344 


PHOTOSYSTEM II PROTEIN
PRECURSOR
PHOTOSYNTHESIS.


PD02346F 12.89 4.340e-
09 317-354




811 


BL00685


CBF-A/NF-YB subunit
proteins.


BL00685B 14.41 6.779e-
14 54-95 BL00685A
11.22 4.798e-13 5-54


812 


PR00080


ALCOHOL DEHYDROGENASE 
SUPERFAMILY SIGNATURE 


PR00080A 9.32 9.419e- 
10 93-105 


813 


BL00357


Histone H2B proteins.


BL00357 7.74 1.988e-17
22-65 


815 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 7.923e-
15 158-171 PD00066
13.92 5.200e-14 46-59
PD00066 13.92 7.000e-
14 18-31 PD00066
13.92 7.000e-13 130-
143 PD00066 13.92
7.500e-13 214-227
PD00066 13.92 9.000e-
13 102-115 PD00066
13.92 4.429e-12 186-
199 PD00066 13.92
1.783e-11 74-87


816 


BL01195 


Peptidyl-tRNA hydrolase 
proteins. 


BL01195C 20.12 3.348e- 
20 100-139 




BL00520


Interleukin-10 family
proteins.


BL00520A 6.21 6.471e- 
09 1-14 


822 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972A 11.93 8.113e-
09 224-242 


QIC 


PR00876


NEMATODE METALLOTHIONEIN
SIGNATURE


PR00876B 7.66 2.268e-
10 101-115


829 


PD02855 


FLAVOPROTEIN PROTEIN 
DNA/PANTOTHEN. 


PD02855A 18.37 4.732e-
28 88-124 PD02855B
8.36 6.478e-09 132-142


830 


PR00405


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 7.000e- 
21 44-62 PR00405C 
19.41 1.000e-13 65-87
PR00405A 17.71 7.283e- 
13 25-45 


831 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 1.000e-
09 47-61 PR00019B
11.36 1.720e-09 136-
150 PR00019B 11.36
3.880e-09 44-58


832 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011B 13.08 3.438e-
16 164-183 PR00011D
14.03 6.850e-16 164-
183 PR00011A 14.06
8.364e-14 164-183
PR00011C 24.25 5.415e-
12 231-260 PR00011D

231 


834 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


12 232-246 


835


PD00306


PROTEIN GLYCOPROTEIN
PRECURSOR RE.


PD00306A 10.26 4.000e-
10 290-304 


836


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e- 
12 216-230 


837 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 3.898e-
09 78-111


839 


PD02784


PROTEIN NUCLEAR
RIBONUCLEOPROTEIN.


PD02784B 26.46 8.302e-
09 73-116 


840 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700B 16.80 5.091e-
22 369-390 PR00700D
12.47 5.765e-21 491-
510 PR00700C 13.17
4.750e-14 449-467
PR00700F 11.18 8.500e-
11 538-549 PR00700E
17.57 3.100e-10 522-
538


841 


ppnoing 


CATALYTIC DOMAIN 
SIGNATURE 


DDnm noil i o n c n * - 
13 134-153 


844 


PD02785 


PROTEIN RIBOSOMAL 60S 
L22 RNA-BINDING KPP 


PD02765B 14.43 1.000e-
15.23 1.9l5e-28 8-57 


845 






n J_»U UHibL / . o 3 b . / Joe- 

09 203-230 


846 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 4.429e- 
10 15-24 


849 


BL00518


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 1.000e-
08 340-349 


850 


PR00308


TYPE I ANTIFREEZE
PROTEIN SIGNATURE


PR00308A 5.90 6.506e-
09 12-27 


851 


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 7.000e- 
16 246-280 


852 


BL00420


Speract receptor repeat 
proteins domain 
proteins. 


BL00420B 22.67 1.000e-
40 723-778 BL00420B
22.67 1.321e-38 933-
988 BL00420B 22.67
8.457e-28 482-537
BL00420B 22.67 4.500e-
27 587-642 BL00420B
22.67 9.625e-27 270-
325 BL00420B 22.67
4.205e-26 163-218
BL00420B 22.67 5.731e-
23 55-110 BL00420B
22.67 6.464e-20 377-
432 BL00420B 22.67
2.800e-15 830-885
BL00420C 11.90 1.900e-
13 355-366 BL00420C
11.90 1.900e-12 808-
819 BL00420C 11.90
3.550e-12 248-259
BL00420C 11.90 2.831e-
11 141-152 BL00420C
11.90 5.119e-11 1018-
1029 BL00420C 11.90
7.955e-10 567-578


853 


BL00420 


Speract receptor repeat 
proteins domain 
proteins.


BL00420B 22.67 1.000e-
40 756-811 BL00420B
22.67 1.321e-38 966-
1021 BL00420B 22.67
8.457e-28 482-537
BL00420B 22.67 4.500e-
27 620-675 BL00420B
22.67 9.625e-27 270-
325 BL00420B 22.67
4.205e-26 163-218
BL00420B 22.67 5.731e-
23 55-110 BL00420B
22.67 6.464e-20 377-
432 BL00420B 22.67
2.800e-15 863-918
BL00420C 11.90 1.900e-
13 355-366 BL00420C
11.90 1.900e-12 841-
852 BL00420C 11.90
3.550e-12 248-259
BL00420C 11.90 2.831e-
11 141-152 BL00420C
11.90 5.119e-11 1051-
1062 BL00420C 11.90 
7.955e-10 567-578 


857 


PR00388


3',5'-CYCLIC NUCLEOTIDE
CLASS II
PHOSPHODIESTERASE
SIGNATURE


PR00388A 10.45 2.778e- 
09 64-83 


859 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 2.929e- 
13 37-56 BL00030B 
7.03 1.900e-11 167-177
BL00030A 14.39 2.000e- 
10 128-147 


861 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.250e- 
17 23-41 PR00988C 
13.64 8.714e-16 107-
123 PR00988F 12.23
7.820e-15 198-212
PR00988E 8.27 9.769e-
12 176-188 PR00988D
5.95 8.250e-11 163-174
PR00988B 11.60 4.512e- 
10 60-72 


863 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215B 10.44 8.071e-
12 41-54 


864 


PR00775 


90 KD HEAT SHOCK PROTEIN 
SIGNATURE 


PR00775E 8.06 1.000e-
24 i98-2?i ppnft77*;n 
3.52 1.837e-23 107-130 
PR00775D 8.91 4.484e- 
17 171-189 PR00775A 
9.90 8.342e-17 86-107 
PR00775C 10.68 9.379e-
17 153-171 PR00775G 
10.64 6.850e-15 267- 
286 PR00775F 12.76 
6.769e-14 249-267


866 


DM01688


2 POLY-IG RECEPTOR.


DM01688G 16.45 9.460e- 
09 89-121 


867 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 5.596e- 
29 14-53 




868 


BL01287


RNA 3'-terminal
phosphate cyclase
proteins.


BL012B7A 17.95 2.t»88e- 
26 16-48 




869 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 6.464e- 
10 304-337 




872 


BL00046 


Histone H2A proteins. 


BL00046 12.95 1.000e-
40 30-85 




874 


BL00188 


Biotin-requiring enzymes
attachment site
proteins.


BL00188 30.29 9.031e-
32 665-711




876 


BL00028 


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.686e-
09 298-315 




877 


PD02102


SUBUNIT E V-ATPASE 
VACUOLAR ATP SYNTHASE 
HYDROL. 


PD02102X ltj.74 4.176e- " 
10 97-141 




879


BL01189


Ribosomal protein S12e
proteins.


BL01189A 14.27 1.000e-
40 35-71 BL01189B 
13.49 1.000e-40 71-125 




882 


BL00284 


Serpins proteins. 


BL00284C 28.56 6.400e- 
25 62-104 BL00284B 
17.99 6.182e-12 35-56 




889 


BL00216 


Sugar transport 
proteins. 


BL00216B 27.64 4.375e- 
21 35-85 




896 


PR00391 


PHOSPHATIDYLINOSITOL 
TRANSFER PROTEIN 
SIGNATURE 


PR00391E 12.50 7.785e- 
15 211-231 PR00391B 
8.39 1.000e-13 83-104
PR00391D 12.21 9.328e-
13 191-207 PR00391A
7.83 5.390e-11 16-36


897


PR00327


ICE NUCLEATION PROTEIN


PR00327C 6.37 5.247e-
SIGNATURE 


09 313-328


898 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 7.800e-
26 386-432 BL00039A 
18.44 6.674e-16 113- 
152 BL00039B 19.19 
1.947e-13 153-179 
BL00039C 15.63 9.460e- 
11 236-260 


901 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.200e-
16 254-267 PD00066
13.92 8.200e-16 282-
295 PD00066 13.92
8.200e-16 310-323
PD00066 13.92 8.200e-
16 366-379 PD00066
13.92 8.200e-16 394-
407 PD00066 13.92
8.200e-14 338-351


902 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 9.321e- 
11 6-50 


903 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 9.160e- 
09 97-111 


904 


PR00381


KINESIN LIGHT CHAIN
SIGNATURE



PR00381E &. 75 6.58t>e- 
25 335-356 PR00381B 
18.17 2.667e-24 204- 
224 PR00381A 9.55 
2.800e-24 107-125 
PR00381C 12.48 4.522e- 
24 226-245 PR00381D 
13.94 1.054e-22 291-
309 PR00381F 9.13 
3.288e-22 370-392 
rftuujoir y . u /.xoJLe— 
13 286-308 PR00381E 
8.75 4.066e-ll 2 I ?l-777 
PR00381E 8.75 7.033e-
11 293-314 PR00381E 
8.75 8.364e-10 377-398 
PR00381D 13.94 5.230e- 
09 333-351 PR00381C 
12.48 7.120e-09 310- 
329 


906 


PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345C 4.54 8.557e-
09 525-549 


907 


PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345C 4.54 8.557e-
09 513-537 


908 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 9.308e-11
144-155 


910 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.800e- 
30 48-87 


912 


BL01104 


Ribosomal protein L13e 
proteins . 


BL01104C 15.14 6.000e- 
09 364-392 


922 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 3.842e-09
500-511 


923 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 2.500e- 
09 323-338 PR00320C 
13.01 5.500e-09 187- 
202 


924 


PD02181 


PROTOCHLOROPHYLLIDE
REDUCTASE PHOTOSYNT.


PD02181D 12.85 8.609e- 
09 36-64 


92* 


BL00019 


Actinin-type actin-
binding domain proteins.


BL00019C 14.66 7.453e-
25 108-144 BL00019B
13.34 6.510e-11 61-84
BL00019D 15.33 9.338e- 
11 205-235 BL00019A 
12.56 2.373e-10 34-45 


928 


BL00678 


Trp-Asp (WD) repeat 


BL00678 9.67 $.308e-ll 
proteins proteins. 


273-284 BL00678 9.67
1.600e-10 314-325 
BL00678 9.67 7.600e-10 
360-371 BL00678 9.67 
8.579e-09 206-217 


929 


BL00518 


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 1.857e-
10 137-146 


930 


BL01085


Ribulose-phosphate 3-
epimerase family
proteins.


BL01085D 16.55 4.600e-
24 134-165 BL01085B

BL01085E 18.87 8.676e- 

21.81 2.038e-14 66-97 


931 


BL01085


Ribulose-phosphate 3-
epimerase family
proteins.


24 152-183 BL01085B 

1U .13 3.00UC ZZ JU-J^ 

BL01085E 18.87 8.676e- 
20 190-220 BL01085C 

£.X.O± .c . UJoe- J.** DO- J / 


933 


PD00301 


PROTEIN REPEAT MUSCLE
CALCIUM-BI.


PDdd36lA 10.24 £.400e- 

US ID 11— A /JL 


936 


PF00168


C2 domain proteins.


PF00168C 27.49 4.000e-
12 336-362 


937 


BL00415 


Synapsins proteins.


dLiUUI-IdN a . 2.3 3.S19e- 

10 5-49 


940


PR00862


PROLYL OLIGOPEPTIDASE
SERINE PROTEASE (S9A) 
SIGNATURE 


fKUUbfczu 16.17 4.086e- 
09 63-84 


945


BL01230


RNA methyltransferase
trmA family proteins.


BL01230B 11.62 2.373e- 


948


BL00479


Phorbol esters /
diacylglycerol binding
domain proteins.


BL00479B 12.57 7.429e- 

1ft CO-CD nTrtftjI'JGlV 

19.86 2.200e-13 26-49 


949 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 1.474e-09 
100-111 


954 


PD01311 


PROTEIN OXIDOREDUCTASE
NAD INTERGENIC RE.


PD01311A 30.23 5.909e- 
10 66-111 


955 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins.


PF00651 15.00 3.250e-
12 47-60


956


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 3.250e-
12 47-60 


957 


BL00379


CDP-alcohol
phosphatidyltransferases
proteins.


bLiUJJ /J* z4 . o4 i.oioe- 
15 111-148 


959 


BL01115


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 1.884e- 
10 31-75 


960 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 3.438e-
14 110-154


962 


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 6.586e- 
13 198-236 


963 


PR00502 


MUTT DOMAIN SIGNATURE


PR00502A 15.06 8.200e- 
11 210-225 


966


PR00308


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


Pk0030fl& d aft *i mc 0 

09 55-70 


967 


DM01206


CORONAVIRUS NUCLEOCAPSID
PROTEIN.


DM01206B 10.69 1.286e-
12 104-124 DM01206B
10.69 5.299e-11 23-43
DM01206B 10.69 8.274e-
10 73-93 DM01206B
10.69 3.962e-09 108-
128 DM01206B 10.69
5.671e-09 38-58


969 


PF01008


initiation factor 2
subunit.


PF01008B 25.59 4.724e-
31 417-460 PF01008C
12.25 5.333e-18 506-
526 PF01008A 20.14
5.875e-15 369-390


970 


BL01277 


Ribonuclease PH
proteins. 


BL01277C 10.18 7.648e- 
10 112-143 BL01277A 
17.39 9.806e-10 40-78 


975 


BL01159 


WW/rsp5/WWP domain 
proteins . 


BL01159 13.85 3.605e- 
12 130-145 BL01159 
13.85 4.122e-10 171- 
186 


977 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors.


PF00791C 20.98 2.235e-
09 55-94 


978 


BL01167 


Ribosomal protein L17
proteins. 


BL01167B 20. 6$ B.258e- 
19 88-127 


979 


BL00478 


LIM domain proteins. 


BL00478B 14.79 9.357e-
13 33-48 BL00478B 
14.79 7.250e-12 98-113 


980 


PR00312 


CALSEQUESTRIN SIGNATURE


PR00312E 8.32 3.423e-
35 169-199 PR00312I
15.78 5.286e-35 332-
361 PR00312F 15.06
5.865e-35 199-229
PR00312H 13.31 8.313e-

13.73 5.688e-34 363-
392 PR00312D 9.41
2.636e-33 128-158 
PR00312C 15 14 9 B"*ej#»- 
33 92-122 PR00312B 
15.08 8.941e-33 62-92 
PR00312G 11.11 6.657e- 
32 230-258 PR00312A 
11.70 6.914e-27 35-59 


981 


PF00992 


Troponin . 


PF00992A 16.67 8.815e-
09 414-449 


982 


PR00299


ALPHA CRYSTALLIN
SIGNATURE


PR00299F 13.20 2.367e-
09 127-149


983 


BL01150 


Respiratory-chain NADH
dehydrogenase 20 Kd
subunit proteins.


BL01150B 17.16 1.000e-
40 156-202 BL01150A 
14.10 8.200e-39 100- 
138 


986 


BL00795 


Involucrin proteins.


BL00795C 17.06 7.211e-
14 4-49 BL00795C

17.06 1.778e-11 1-46
BL00795C 17 OS 1 4H7a- 
10 14-59 BL00795C 
17.06 7.802e-10 2-47 
BL00795C 17.06 8.640e- 
10 19-64 BL00795C 
17.06 7.400e-09 11-56
BL00795C 17.06 7.800e- 
09 3-48 


987 


BL00939


Ribosomal protein L1e
proteins.


BL00939E 17.27 5.393e-
09 810-840 


988 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e- 
11 525-541 


989 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.55 6.538e- 
11 497-513


994 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 2.500e-
25 146-189 


997 


BL01304 


ubiH/COQ6 monooxygenase
family proteins. 


BL01304A 8.05 3.893e- 
11 65-79 


998 


DM01767 


5 TRANSMITTER DOMAIN. 


DM01767B 10.07 7.8^8e- 
09 22-39 


1000 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926C 16.07 1.750e-
24 73-94 PR00926D 
10.53 3.250e-23 126- 
145 PR00926F 17.75 
6.211e-23 217-240 
PR00926E 11.70 6.625e- 
20 174-193 PR00926B 
16.07 2.125e-18 24-39
PR00926A 10.41 1.000e-
15 13-25 PR00926F
17.75 5.565e-09 120-
143 


1005 


BL00406 


Actins proteins.


BL00406B 5.47 1.000e-
40 88-143 BL00406C 
6.75 1.000e-40 147-202 
BL00406D 12.58 3.700e- 

8.44 7.375e-38 327-377 
BL00406A 9.95 3.348e- 
29 11-46 


1006 


BL00406 


Actins proteins. 


BL00406B 5.47 1.000e-
40 88-143 BL00406C
6.75 1.000e-40 147-202
BL00406E 8.44 1.000e-
35 248-298 BL00406A 
9.95 3.348e-29 11-46 


1007


PR00304


TAILLESS COMPLEX
POLYPEPTIDE 1
(CHAPERONE) SIGNATURE


PR00304D 11.04 8.714e-
22 384-407 PR00304C
8.69 4.667e-20 98-118
PR00304B 11.60 7.577e-
19 68-87 PR00304A
9.20 3.382e-16 46-63
rKUUJU4b 7.79 o.870e- 
13 418-431 


1009 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 2.929e-
32 9-48 


1011 


PD01066 


PROTEIN ZINC FINGER 

ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.929e- 
3Z oo-JLU / 


1012 


BL00518


Zinc finger, C3HC4 type 


BL00518 12.23 6.143e- 
xu o^t— /J 


1016


PD01168


SYNTHETASE LIGASE
PROTEIN ALANYL.


PD01168H 12.08 1.000e-


1018 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION.



PD00930B 33.72 1.391e-
32 261-302 PD00930A
25.62 9.550e-22 157-
i hi 


1022 


BL00175 


Phosphoglycerate mutase
proteins. 


BL00175A 15.42 5.179e- 

X&. D-CO DUVUl /DC 

23.75 8.062e-10 79-111


1025 


PR00305 


14-3-3 PROTEIN ZETA 
SIGNATURE 


10 158-185 


1026 


BL00353 


HMG1/2 proteins. 


BL00353B 11.47 2.436e- 
18 238-268 BL00353C 

A" " O J O. Otitis XX 400 

335 


1028 


BL00183


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 1.310e-
33 43-91 


1033 


PF00580 


UvrD/REP helicase. 


PF00580A 13.37 4.720e- 
09 111-133 


1034 


PR00413


HALOACID
DEHALOGENASE/EPOXIDE
HYDROLASE FAMILY 
SIGNATURE 


PR00413E 15.78 3.429e- 
09 154-171 


1037 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 9.6*576- 
09 5-44 


1038


PD01796 


PROTEIN TRANSMEMBRANE 
COBALT ZINC CADMIU. 


PD01796 15.01 4.259e- 
11 55-82 


1039 


BL00299 


Ubiquitin domain 
proteins. 


BL00299 28.84 9.036e- 
09 17-69 


1040


PR00970 


ARGININE ADP- 
RIBOSYLTRANSFERASE 


PR00970A 17.73 6.143e-
20 56-78 PR00970D
SIGNATURE 


9.96 2.154e-18 154-171
PR0097DF 12 30 1 finflp- 
16 224-241 PR00970G 

9.97 9.229e-15 242-258 
PR00970B 16.37 1.290e- 
13 86-105 PR00970C 
11.05 1.643e-11 115-
130 PR00970E 11.23
9.820e-11 202-218


1042 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 2.200e-10
243-254


1043 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR0004BA 10 52 fi 7AGf»~ 
13 114-128 PR00048A 
10.52 1.000e-09 172- 
186 


1045 


BL00615 


C-type lectin domain 
proteins . 


BL00615A 16.68 1.720e- 
11 21S-21g RT.O.0£l^A 

12.25 1.857e-10 317- 
331 


1046 


BL01092 


Adenylate cyclases
class-I proteins.


BL01092N 13.54 8.924e-
10 3-40 


1047 


BL01216 


ATP-citrate lyase / 
succinyl-CoA ligases 
family proteins.


BL01216D 21.75 4.316e- 
28 314-344 BL01216A 
<jlj . ^ j. x.uuue-iu J7/-jljl« 


1049 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 7.610e-
12 102-136 


1050 


BL01073 


Ribosomal protein L24e 
proteins. 


BL01073 24.30 1.000e-
40 12-62


1054


BL00571


Amidases proteins.


BL00571 25.69 5.875e-
31 160-212 


1055 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 5.235e-
11 98-117 BL00030B
7.03 4.316e-09 137-147


1058 


BL00223 


Annexins repeat proteins 
domain proteins . 


BL00223C 24.79 8.754e-
23 262-317 BL00223A 

1C CO Q /inn- « * /i n r\ 

BL00223A 15.59 5.557e- 
11 118-152 


1060 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 3.455e- 
35 158-201 


1064 


BL00455 


Putative AMP-binding 
domain proteins. 


BL00455 13.31 6.211e- 
13 280-296 


1065 




SIGNATURE 


PR00019A 11.19 2.000e- 
09 115-129 PR00019B 
11.36 3.880e-09 87-101


1066 


PR00326


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE


PR00326A 8.75 4.600e-

AO ^KUUJ^bL 

9.79 1.290e-14 200-216
PR00326B 16.74 8.548e-
14 172-191 PR00326D 
19.09 1.257e-13 217- 
236 


1071 


PD02870 


RECEPTOR INTERLEUKIN-1 
PRECURSOR . 


PD02870B 18.83 8.518e- 
11 164-197 


1072 


PF00856 


SET domain proteins. 


PF00856A 26.14 5.976e-
09 350-387


1075 


BL01009 


Extracellular proteins
SCP/Tpx-1/Ag5/PR-1/Sc7
proteins.


BL01009D 14.19 4.300e-

13.75 6.586e-13 57-75
BL01009E 13.50 1.439e-
11 159-175 


1077 


PR00724


CARBOXYPEPTIDASE C
SERINE PROTEASE (S10)
FAMILY SIGNATURE


PR00724A 10.91 1.000e-
08 366-379 


1078 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 1.000e-
12 170-195 BL00215A 
15.82 7.529e-10 79-104 


1079 


BL00678 


Trp-Asp (WD) repeat 


BL00678 9.67 4.316e-09
proteins proteins. 


298-309 


1081


BL00326


Tropomyosins proteins. 


BL00326A 14.01 7.39fie- 
10 23-57 


1094 


BL00460


Glutathione peroxidases 
selenocysteine proteins. 


BL00460A 28.67 3.204e- 
16 57-92 BL00460B 
9.73 6.400e-13 100-118 
BL00460D 16.89 9.143e- 
12 162-182 BL00460C 
14.35 5.500e-09 133- 
156 


1095 


PD02811


PROTEIN PEPTIDE
REDUCTASE MG448 PILB
FIMBRIA TRAN.


PD02811A 20.67 3.017e-
22 67-105 PD02811B 
17.07 2.263e-21 118- 
151 PD02811C 13.25 
5.696e-13 154-167 


1096 


PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB 
FIMBRIA TRAN. 


PD02811A 20.67 3.017e- 
22 60-98 PD02811B 
17.07 2.263e-21 111- 
144 PD02811C 13.25 
5.696e-13 147-160


1097 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 6.143e- 
09 200-216 


1105 


PF00881 


Nitroreductase family. 


PF00881A 27.15 9.229e-
13 111-147 


1109 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


10 15-37 PR00449E 
13.50 1.857e-09 185- 
208 PR00449D 10.79 
8.364e-09 131-145 


1115 


PR00405


HIV REV INTERACTING
PROTEIN SIGNATURE


PR00405B 11.83 5.737e-
20 42-60 PR00405A
17.71 2.703e-17 23-43 
PR00405C 19.41 6.902e- 
10 53-85 


1116


BL00355 


HMG14 and HMG17 
proteins . 


BL00355 5.97 2.£28e-25 
20-51 


1117 


BL00355 


HMG14 and HMG17 
proteins . 


BL00355 5.97 2.S28e-25 
20-51 


1120 


BL00107


Protein kinases ATP-
binding region proteins.


BL00107B 13.31 4.857e-
10 290-306 


1123 


PR00412


EPOXIDE HYDROLASE 
SIGNATURE 


PR00412* 18.761 9.526e- 
12 301-324 


1125 


PR00186


HEMERYTHRIN SIGNATURE


PR00186A 13.62 2.600e-
09 87-101 


1129 


BL00170 


Cyclophilin-type
peptidyl-prolyl cis-
trans isomerase
signatur.


BL00170C 18.49 3.077e-
33 84-129 BL00170B
20.97 6.838e-25 37-77
BL00170A 17.08 3.455e-
15 10-37 


1131 


BL00636 


Nt-dnaJ domain proteins. 


BL00636A 8.07 5.304e- 
15 29-46 BL00636B 
15.11 1.360e-14 59-80 


1132 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 6.211e-09 
29-40 


1133 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 6.211e-09
29-40 


1136 


BL00990


Clathrin adaptor
complexes medium chain
proteins.


BL00990C 18.78 4.176e-
38 235-269 BL00990A
21.44 4.316e-36 94-132
BL00990B 20.15 2.125e-
27 157-187 BL00990D
16.13 5.320e-18 403-
422 


1137 


PR00314


CLATHRIN COAT ASSEMBLY
PROTEIN SIGNATURE


PR00314B 15.68 8.000e-
34 100-128 PR00314D 
9.66 3.531e-33 233-261 
PR00314C 16.05 8.909e- 
32 159-188 PR00314A 
14.53 1.281e-22 13-34



1139 



BL01115 



GTP-binding nuclear
protein ran proteins.



BL01115A 10.22 6.364e-
13 13-57



1141



BL00107



Protein kinases ATP-
binding region proteins.



BL00107A 18.39 4.00Ge- 
19 451-482 BL00107B
13.31 3.077e-12 519-
535 



1148 



PR00685 



TRANSCRIPTION INITIATION 
FACTOR IIB SIGNATURE 



PR00685A 13.62 4.676e- 
09 21-42 



1155 



PD01652 



RECEPTOR CELL NK 
GLYCOPROTEIN IMMUNOGLOB. 



PD01652B 8.50 9.396e- 
10 522-574 PD01652B 
8.50 9.463e-10 740-792 



1157 



PD02894



HYDROLASE N4- PRECURSOR
PROTEIN SIGNAL BE.



PD02894A 21.96 7.873e-
28 81-127 PD02894B
13.93 1.188e-27 178-
211 



1159



BL00623



GMC oxidoreductases
proteins.



BL00623E 15.00 3.531e-
20 391-414 BL00623C
10.86 4.240e-20 155-
176



1161



PD01937



DNA PROTEIN POLYMERASE
ENDONUCLEASE DNA-.



PD01937A 6.68 3.475e-
09 330-341



1162



PD01937



DNA PROTEIN POLYMERASE
ENDONUCLEASE DNA-.



PD01937A 6.68 3.475e-
09 221-232



1163



PR00624



HISTONE H5 SIGNATURE



PR00624D 11.94 7.455e-
10 214-239 PR00624D
11.94 1.961e-09 312-
337



1167



BL00226



Intermediate filaments
proteins.



BL00226B 23.86 7.384e-
09 302-350



1177



BL01032



Protein phosphatase 2C
proteins.



BL01032G 8.33 1.422e-
10 34-48



1178



PR00320



G-PROTEIN BETA WD-40
REPEAT SIGNATURE



PR00320A 16.74 1.794e-
10 205-220 PR00320C
13.01 7.840e-10 205-
220 PR00320B 12.19
8.457e-10 35-50
PR00320A 16.74 7.146e-
09 35-50 PR00320B
12.19 9.100e-09 79-94



1180



PR00454



ETS DOMAIN SIGNATURE



PR00454D 10.89 4.150e-
19 765-784



1181



BL00291



Prion protein.



BL00291A 4.49 8.962e-
11 152-187



1184



BL00720



Guanine-nucleotide
dissociation stimulators
CDC25 family sign.



BL00720B 1^.57 4.103e-
18 1089-1113



1187
1188



BL00215



BL00983
BL00878



Mitochondrial energy
transfer proteins.



Ly-6 / u-PAR domain 
proteins . 



BL00215A 15.82 4.553e-
13 204-229 BL00215A
15.82 1.429e-12 11-36
BL00215A 15.82 9.809e-
11 104-129



BL00983C 12.69 2.761e-
10 77-93 



1191 



PD02939 



Orn/DAP/Arg 

decarboxylases family 2 
pyridoxal-P attachment 
si. 



BL00878B 10.95 6.000e- 
16 189-204 BL00878C 
17.74 8.435e-15 225- 
245 BL00878F 19.67 
3.625e-13 379-402 
BL00878D 16.55 1.621e-
09 270-289 



PROTEIN GLUTATHIONE 
SYNTHETASE SY. 



PD02939B 10.10 2.723e- 
12 203-220 PD02939C 
20.01 1.000e-11 224-
252 



1193 



PR00345



STATHMIN FAMILY 
SIGNATURE 



PR00345B 7.12 2.800e- 
28 72-101 PR00345E 
8.54 7.652e-28 149-174 
PR00345C 4.54 9.100e- 
28 101-125 PR00345D 
10.97 1.964e-24 125- 
149 PR00345A 13.46 
5.645e-16 43-62 


1194 


PR00345


STATHMIN FAMILY
SIGNATURE


PR00345B 7.12 2.800e- 
28 106-137 PR00345E 
8.54 7.652e-28 185-210 
PR00345C 4.54 9.100e- 
28 137-161 PR00345D 
10.97 1.954e-24 161- 
185 PR00345A 13.46 
5.645e-16 79-98


1195 


PF00995 


Sec1 family.


PF00995B 17.37 1.120e-
13 224-264 


1196 


BL00982


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 6.738e-
11 15-47


1197


BL01298


Dihydrodipicolinate
reductase proteins.


BL01298A 13.90 5.959e-
09 51-73 


1203 


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 1.000e-
14 152-190 


1204 


PR00118 


BETA-LACTAMASES CLASS A
SIGNATURE


PR00118F 16.42 9.386e-
09 213-229 


l one 


BL01183 


ubiE/COQ5 

methyltransferase family 
proteins. 


BL01183B 21.31 1.429e-
37 184-229 BL01183D
27.71 8.535e-27 264-
307 BL01183A 13.25
3.250e-23 51-73
BL01183C 10.77 5.295e-
09 246-258 


1208 


BL00979 


G-protein coupled 
receptors family 3 
proteins. 


BL00979L 20.63 2.485e- 
09 105-146 


1209 


PF00023


Ank repeat proteins. 


PF00023A 16. Oi 4.857e- 
11 49-65 PF00023B
14.20 1.818e-09 45-55


1212 


PR00048 


C2H2-TYPE ZINC FINGER 

SIGNATURE 


PR00048A 10.52 7.750e- 
14 227-241 PR00048A 
10.52 4.316e-11 199-
213 


1213 


PR00450 


RECOVERIN FAMILY 
SIGNATURE 


PR00450C 12.22 1.720e- 
10 20-42 PR00450C 
12.22 3.506e-09 56-78
PR00450D 16.58 6.769e- 
09 44-64 


1216 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 5.598e-
10 179-230 


1219 


PR00456


RIBOSOMAL PROTEIN P2
SIGNATURE 


PR00456E 3.0£ S.348e- 
11 249-264 


1222 


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 7.231e-
15 295-308 PD00066
13.92 7.231e-15 406-
419 PD00066 13.92
2.286e-12 378-391
PD00066 13.92 7.857e-
12 434-447 PD00066
13.92 3.348e-11 350-
363 


1223 


BL50058 


G-protein gamma subunit
profile.


BL50058 27.23 1.000e-
40 13-61


1226


BL00412 


Neuromodulin (GAP-43) 
proteins. 


BL00412D 16.54 8.439e- 
09 279-330 


1227 


BL00437 


Catalase proximal heme- 
ligand proteins. 


BL00437A 18.82 1.000e-
40 49-101 BL00437B
16.28 1.000e-40 114-
168 BL00437C 21.86
1.000e-40 190-239
BL00437D 25.72 1.000e-
40 248-301 BL00437E 
23.95 1.000e-40 327- 
379 


1230 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 8.297e- 
10 5-60 


1231 


PR00735 


GLYCOSYL HYDROLASE 
FAMILY 8 SIGNATURE 


PR00735A 11.19 6.857e- 
09 391-405 


1232 


PR00497


NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE 


PR00497A 6.92 5.553e- 
10 158-176 


1233 


PR00497 


NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE 


PR00497A 6.92 5.553e-
10 158-176 


1235 


BL00866 


Carbamoyl-phosphate
synthase subdomain 
proteins. 


BL00866B 36.29 2.776e- 
09 75-121 


1237 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 1.818e-
21 36-79 


1243 


PR00403 


WW DOMAIN SIGNATURE 


PR00403B 12.19 1.184e- 
11 10-25 


1246 


PD01168 


SYNTHETASE LIGASE 
PROTEIN ALANYL. 


PD01168L 9.47 2.837e- 

10 31-46 PD01168L

9.47 4.490e-10 174-189 
PD01168L 9.47 7.612e- 
10 183-198 


1249 


BL00018 


EF-hand calcium-binding
domain proteins. 


BL00018 7.41 2.800e-10 
183-196 


1254


BL00183


Ubiquitin-conjugating
enzymes proteins . 


BL00183 28.97 2.440e- 
36 96-144 




BL01115 


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 5.670e-
11 8-52 


1256 


BL00373 


Phosphoribosylglycinamide
formyltransferase
proteins. 


BL00373C 10. 3$ 3 .348e- 
12 143-156 


1258


PR00011


TYPE III EGF-LIKE
SIGNATURE


PR00011B 13.08 3.217e-
10 174-193 


1259 


BL00518 


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 8.286e-
10 31-40


1261 


PR00070


SIGNATURE 



ppnnmnn i i i nnr\e%." " 
fRUUU /uu XI. bJ l.uoue- 

15 112-127 PR00070C

13.09 9.500e-15 51-63 

PR00070A IP 99 5 enrt A _ 

12 16-27 


1262 


BL00462 


Gamma-
glutamyltranspeptidase
proteins. 


24 140-183 BL00462B 
17.88 5.500e-20 230- 
267 BL00462C 27.41 
2.023e-ll 292-347 


1263 


BL00038


Myc-type, 'helix-loop-
helix' dimerization
domain proteins. 


BL00038B 16.97 9.455e- 
11 52-83 


1264 


BL01115 


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 5.670e- 
11 17-61 


1266 


PR00837 


ALLERGEN V5/TPX-1 FAMILY 
SIGNATURE 


PR00837C 17.21 2.714e- 
18 165-182 PR00837A
14.77 4.512e-12 86-105 
PR00837D 11.12 7.577e- 
12 201-215 


1269 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449C 17.27 9.308e- 
22 40-63 PR00449E 
13.50 1.000e-16 137- 
160 PR00449D 10.79 
3.520e-11 102-116


1270 


BL00276


Channel forming colicins
proteins.


BL00276A 8.87 1.500e-
09 17-29


1275


PD02327 


GLYCOPROTEIN ANTIGEN 
PRECURSOR IMMUNOGLO. 


PD02327C 15.47 9.769e- 
09 228-243 


1276 


PR00412


EPOXIDE HYDROLASE 


PR00412B 12.59 7.894e- 
SIGNATURE 


12 119-135 PR00412C 
11.30 1.857e-ll 165- 
179 PR00412A 13.23 
3.400e-ll 100-119 


1277 


PF00756 


Putative esterase. 


PF00756C 14.12 9.538e-
10 127-157 


1279 


BL00134 


Serine proteases, 
trypsin family, 
histidine proteins. 


BL00134A 11.96 9.325e- 
13 128-145 


1280


BL01220


Phosphatidylethanolamine
-binding protein family 
proteins. 


BL01220C 14.75 9.348e- 
15 248-276 


1285


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 2.286e- 
10 33-42 


1287 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin
receptors.


PF00791B 28.49 7.182e-
11 288-343 


1292


PR00802


SERUM ALBUMIN FAMILY
SIGNATURE


PR00802B 16.51 1.610e-
10 81-105 


1297 


PR00716 


M-PHASE INDUCER
PHOSPHATASE SIGNATURE 


PR00716C 17.65 5.696e- 
09 23-44 


1298 


BL00478


LIM domain proteins.


BL00478B 14.79 6.478e-
14 268-283


1301 


BL00127 


family proteins. 


BL00127C 31.49 3.571e-
28 82-126 BL00127B 
^b.b/ o.600e-28 23-68 


1302 


PR00637 


TYPE 3 BOMBESIN RECEPTOR 
SIGNATURE 


PR00637E 11.27 $.250e = 

no Ton i r> c 


1307 


BL00215


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 5.500e- 
17 13-38 BL00215A 

1C □ "J 1 HAAn -I i- -i <-i /» 

B2 1.00ue-16 22G- 
251 BL00215A 15.82 

^ .030c"xJ JLU / — JLJ« 


1308 


PR00898 


VASOPRESSIN V2 RECEPTOR 
SIGNATURE 


PR00898H 11.34 4.682e- 
no cco-ci*) 


1309 


PD00301


PROTEIN REPEAT MUSCLE
CALCIUM-BI.


PD00301B 5.49 2.731e-
09 390-401 


1310 


BL00983 


Ly-6 / u-PAR domain
proteins.


BL00983C 12.69 9.654e-
13 73-89 BL00983B 
8.19 3.132e-09 12-22


1313


BL00194


Thioredoxin family 
proteins . 


BL00194 12.16 1.900e- 
11 15-28


1314


BL00594


Aromatic amino acids
permeases proteins.


oLiU0b94A 16.75 8.9©9e- 
10 53-97 


1316 


BL00134 


Serine proteases,
trypsin family,
histidine proteins.


BL00134A 11.96 9.325e-
13 128-145 


1320 


BL00783 


Ribosomal protein L13 
proteins. 


ojuvu / ojl . H J b.bo9e- 
24 07-117 BL00783A 
14 55 1 600^-19 
BL00783B 12.76 3.500e- 
12 74-86 


1327 


PF00514 


Armadillo/beta-catenin-
like repeat proteins. 


PF00514A 31.30 7.268e- 
11 82-120 


1329 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 6.294e-
11 129-148 BL00030B
7.03 4.789e-09 168-178 


1331 


PR00497 


NEUTROPHIL CYTOSOL
FACTOR P40 SIGNATURE


PR00497A 6.92 7.229e-
09 25-43 


1332 


PR00161


NICKEL-DEPENDENT
HYDROGENASE/B-TYPE
CYTOCHROME SIGNATURE 


PR00161C 9.51 4.930e- 
09 317-337 


1333 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 6.769e- 
33 10-49


1336


PR00700


PROTEIN TYROSINE
PHOSPHATASE SIGNATURE


PR00700D 12.47 2.200e-
09 262-281 


1337 


PR00700


PROTEIN TYROSINE


PR00700D 12.47 2.200e-
PHOSPHATASE SIGNATURE 


09 211-230 


1340 


PR00860 


VERTEBRATE* 
SIGNATURE 


PR00860A 5.46 5.034e- 
13 5-18 


1341 


BL00893 




BL00893 18.99 6.750e- 
16 46-71 


1343 




bir repeat proteins. 


BL01282B 30.49 5.974e-
21 383-422 


1344 


DM00099 


4 Jew A55R REDUCTASE 
TERMINAL 

DIHYDROPTERIDINE. 


DM00099B 14.73 8.313e-
09 417-427 


1345


BL00923


Aspartate and glutamate 
racemases proteins. 


BL00923B 11.41 *.935e- 
10 135-146


1348


PF00651


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 7.231e-
13 44-57 


1350 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 3.571e-
32 416-445 PR00193C
12.60 6.318e-31 179-
207 PR00193B 11.69
3.571e-24 133-159
PR00193E 19.47 9.069e-
22 470-499 PR00193A
15.41 1.783e-20 77-97 


1352 


PR00447 


NATURAL RESISTANCE- 

ASSOCIATED MACROPHAGE 
PROTEIN SIGNATURE 


PR00447E 9.73 1.554e-
15 299-319 PR00447D
13.54 3.408e-15 200-
224 PR00447A 12.73
6.357e-11 97-124
PR00447G 6.69 9.877e-
10 353-373 


1353 


BL00303


S-100/ICaBP type calcium
binding protein. 


BL00303A 21.77 £.667e- ' 
26 45-82 BL00303B 
26.15 1.000e-24 93-130 


1355 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 5.950e-
29 375-421 BL00039A 
18.44 7.136e-29 99-138 
BL00039C 15.63 4.000e- 
18 225-249 BL00039B 
19.19 3.182e-14 141- 
167 


1357 


PF00615


Regulator of G protein
signalling domain
proteins.


PF00615B 16.25 2.216e-
12 84-101 PF00615C 
10.06 8.412e-12 162- 
176 


1360 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.234e-
29 10-49 


1361 


PR00925


NONHISTONE CHROMOSOMAL 
PROTEIN HMG17 FAMILY 
SIGNATURE 


PR00925A 5.47 5.091e- 
18 14-29 PR00925B 
3.73 6.143e-14 29-42 
PR00925C 5.57 4.789e- 
12 53-64 PR00925D 
6.56 1.857e-10 76-87 


1362 


BL01272 


Glucokinase regulatory
protein family proteins. 


BL01272B 19.61 6.870e- 
30 136-171 BL01272C 
11.68 3.314e-25 249- 
274 BL01272A 6.49 
1.231e-18 99-117 


1363 


BL01272 


Glucokinase regulatory
protein family proteins.


BL01272B 19.61 6.870e-

11.68 3.314e-25 226-
251 BL01272A 6.49
1.231e-18 76-94


1364 


DM00179


w KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 5.304e-
09 167-177 


1368 
1370 


PR00169
PR00988


POTASSIUM CHANNEL 
SIGNATURE 

URIDINE KINASE SIGNATURE 


PR00169A 16.77 1.592e-
09 76-96 

PR00988A 6.39 1.794e-


1371


BL00242


Integrins alpha chain
proteins.


BL00242B 8.13 8.615e- 


1372 


PR00625 


DNAJ PROTEIN FAMILY
SIGNATURE 


PR00625B 13.48 7.353e- 
19 46-67 PR00625A 


1373 


BL00434 


HSF-type DNA-binding 
domain proteins . 


BL00434C 23.85 3.770e- 


1374 


PR00962


LETHAL (2) GIANT LARVAE
PROTEIN SIGNATURE


PR00962C 8.00 6.337e-

AO CAE r ^ 

09 505-526 


1375 


PD02475 


MUCIN EPITHELIAL TUMOR- 


PD02475A 23.18 8.552e- 
10 1111-1150 


1376 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.571e- 
32 24-63 


1380 


BL00194 


Thioredoxin family
proteins.


BL00194 12.16 U.333e-
12 48-61


1381 


DM01970


0 kw ZK632.12 YDR313C
ENDOSOMAL III. 


DM01970B 8.60 1.458e- 
15 1123-1136 


1383 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 7.600e-10 
243-254 


1384 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins . 


BL00678 9.67 7.600e-10 
271-282 


1385 


BL00303 


S-100/ICaBP type calcium 
binding protein. 


BL00303B 26.15 £.2G3e- 
10 95-132 


1386 


BL01160


Kinesin light chain
repeat proteins. 


BL01160B 19.54 5.042e- 
09 1574-1628


1387


BL00518


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 1.000e-
11 52-61 


1389 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 3.600e-
30 10-49 


1390 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 3.5"i2e- 
31 32-71


1392


PR00308


TYPE I ANTIFREEZE
PROTEIN SIGNATURE


PR00308C 3.83 9.723e-
10 127-137 


1393 



PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 9.625e-
25 88-110 PR00380D
9.93 2.406e-20 304-326
PR00380B 12.64 4.414e-
16 208-226 PR00380C
13.18 6.538e-16 243-
262


1394 


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 3.400e-
14 462-475 PD00066
13.92 8.800e-14 348-
361 PD00066 13.92
9.571e-12 405-418
PD00066 13.92 6.087e-
11 490-503 PD00066
13.92 8.043e-11 320-
333


1390 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 6.786e-
32 10-49 


1400 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 7.038e- 
09 270-290 


1406 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930A 25.62 7.324e-
15 363-389


1407 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 7.500e- 
10 457-476 


1408


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 9.550e- 
11 179-193 PR00019A 
11.19 8.826e-10 228- 
242 PR00019B 11.36 
1.360e-09 199-213 
PR00019B 11.36 4.960e-
09 176-190 


1409 


PR00510 


NEBULIN SIGNATURE 


PR00510A 9.09 4.150e-
12 182-202 PR00510B 
12.96 8.767e-12 210- 
230 PR00510F 9.88 
8.172e-10 58-75 
PR00510D 9.21 2.367e- 
09 251-267 


1410 


PD00078


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.696e- 
09 31-44 


1412 


BL00358 


Ribosomal protein L5
proteins. 


**UW W Q O £ «p. * 4 O 1 * Uvv6 N 

40 57-103 BL00358C 
13.75 6.087e-14 122- 

ijo tjjuuujjoiy Al.^o 
5.500e-13 143-158 

11 33-44 


1414 


BL00282 


Kazal serine protease 
inhibitors family 
proteins . 


HTifin^OO 1 C flfl ^ -j "o ' " 

10 511-534 


1415


BL00023 


Type II fibronectin
collagen-binding domain 
proteins. 


BL00023 24.31 4.300e- 
29 40-77 


1417 


PR00681


RIBOSOMAL PROTEIN S1
SIGNATURE


PR00681G 12.54 2.149e-
09 38-60


1418


DM00973


3 kw RESISTANCE BENOMYL
YLL028W CYCLOHEXIMIDE.


DM00973A 21.17 1.462e- 
09 171-208 


1419 


PR00319


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 1.571e-
09 428-443 


1420 


PD01941 


TRANSMEMBRANE
COTRANSPORTER SYMP.


PD01941A 14.81 1.000e-
40 142-196 PD01941B
15.02 7.049e-30 400-
447 PD01941E 15.92
2.475e-20 817-864
PD01941C 19.96 3.118e-
19 488-543 PD01941D
27.18 9.614e-18 641-
690 PD01941F 28.52
5.382e-15 1038-1093


1422 


PR00205


CADHERIN SIGNATURE 


PR00205B 11.39 8.043e- 
12 199-217 


1423


PR00209


ALPHA/BETA GLIADIN
FAMILY SIGNATURE 


PR00209B 4.88 6.318e- 
jlx ±u u y- xu 2 o 


1424


BL50002


Src homology 3 (SH3)
domain proteins profile. 


BL50002A 14.19 8.200e- 
xh jo /-job ±JLir>U002A 
14.19 9.250e-12 298- 
317 BL50002A 14.19 
4.462e-11 208-227
BL50002B 15 1ft i nnn^ 
09 244-258 


1425 


PF00628


PHD-finger.


PF00628 15.84 3.045e-
12 330-345


1426


PF00628 


PHD-finger. 


PF00628 15.84 3.045e- 
12 377-392 


1427


PR00405 


HIV REV INTERACTING
PROTEIN SIGNATURE


PR00405B 11.83 5.114e-
16 281-299 PR00405A 
17.71 4.306e-14 262- 
282 


1428


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 5.219e- 
34 147-193 


1429 

1430


PR00320


G- PROTEIN BETA WD-40 
REPEAT SIGNATURE 


PR00320C 13.01 8.920e- 
10 577-592 




PR00378 


INOSITOL PHOSPHATASE 
SIGNATURE 


PR00378D 16.86 7.563e- 
12 295-314 PR00378B 
13.80 8.650e-10 166-
186 


1431


PR00928


GRAVES DISEASE CARRIER


PR00928B 13.53 3.769e-


PROTEIN SIGNATURE 


10 103-124 


1433 


BL01113 


Clq domain proteins. 


BL01113B 18.26 7.049e- 
15 14-50 BL01113C 
13.18 7.000e-12 82-102 


1434 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 7.983e- 
10 135-150 


1436 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 1.000e-
12 84-103 


1438 




Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290B 13.17 2.500e- 
09 250-268 BL00290A 
20.89 4.000e-09 188-
211 


1440 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 4.960e- 
09 38-52 


1441 


PR00806


VINCULIN SIGNATURE


PR00806B 4.28 4.960e-
09 88-102


1444 


BL00422 


Granins proteins. 


BL00422D 19.48 1.000e-
08 114-138 


1445 


PD01841


PHOSPHORYLASE KINASE
ALPHA MUSCL.


PD01841A 21.71 1.000e-
40 73-123 PD01841B
14.35 1.000e-40 144-
185 PD01841D 17.87
1.000e-40 206-258
PD01841F 13.36 1.000e-
40 296-345 PD01841G
24.26 1.000e-40 349-
403 PD01841I 23.00
1.000e-40 494-536
PD01841J 14.94 1.000e-
40 895-932 PD01841L
18.42 1.000e-40 1083-
1125 PD01841E 18.60
9.719e-38 258-296
PD01841K 14.81 1.000e-
35 1041-1071 PD01841H
21.30 3.189e-31 435-
472 PD01841C 13.78
1.000e-25 185-206
PD01841M 10.82 1.250e-
20 1175-1194


1446 


PF00816


H-NS histone family.


PF00816B 13.84 8.875e- 
09 190-220 




PR00048 


C2H2 -TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 2.080e-
09 402-416 


1448 


DM00315 


072 RIBONUCLEASE 
INHIBITOR. 


DM00315D 18.40 7.393e- 
09 23-67 


1451 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030B 7.03 2.800e- 
10 94-104 


1454 


DM01688 


2 POLY-IG RECEPTOR.


DM01688D 13.44 7.146e-
09 382-405


1455 


PF00777 


Sialyltransferase
family.


PF00777C 18.60 2.929e- 
22 4-59 


1457 


BL00927 


Trehalase proteins.


BL00927C 10.83 8.085e- 
09 42-53 


1460 


BL00545 


Aldose 1-epimerase
proteins.


BL00545C 11.28 7.353e-
17 169-182 BL00545A
10.20 2.071e-15 73-89
BL00545B 13.10 3.942e-
09 140-153 


1466 


PR00097 


ANTHRANILATE SYNTHASE 
COMPONENT II SIGNATURE 


PR00097C 9.42 9.0*9e- 
09 233-245 


1472 


BL01129 


Hypothetical 
yabO/yceC/sfhB family 
proteins . 


BL01129E 13.25 5.250e- 
22 170-195 BL01129C 
25.56 9.526e-18 63-106 


1473 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.821e- 
09 2114-2145 


1475 


PF00686 


Starch binding domain 
proteins. 


PF00686A 13.45 9.100e-
09 267-277


1477


PF00566


Probable rabGAP domain 
proteins . 


PF00566A 12.64 7.333e- 
10 466-476 


1478 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030B 7.03 9.400e-
10 43-53 


1479 


DM00406 


GLIADIN.


DM00406 7.73 8.541e-10 
292-305 


1480 


BL00290


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290B 13.17 2.385e- 
15 69-87 BL00290A
20.89 5.091e-11 12-35


1481 


PR00150


PHOSPHOENOLPYRUVATE
CARBOXYLASE SIGNATURE 


PR00150F 10.45 9.039e- 
09 21-51 


1482 


PF00780 


Domain found in NIK1-
like kinases, mouse
citron and yeast ROM.


PF00780I 14.69 4.825e-
09 107-137 


1483 


BL01160 


Kinesin light chain 
repeat proteins.


09 108-162 


1485 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 5.909e-
25 17-56


1486 


BL00107


Protein kinases ATP-
binding region proteins.


Duvuiv ta 13.31 1.529e- 
09 34-50 


1488 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 9.586e-
10 116-162 


1490 


BL00166


Enoyl-CoA
hydratase/isomerase
proteins.


BL00166D 22.87 2.607e-
24 190-226 BL00166C
18.93 5.500e-14 140-
167 BL00166B 16.92
9.357e-11 93-115


1491 


BL00452 


proteins. 


oLiUWizniU 28. 59 3 . 700e- 
31 63-106 BL00452E
Ax. y4 J.U45e-13 115- 
131 


1492 


PR00019


LEUCINE-RICH REPEAT 
SIGNATURE 


DPfirtrti q& 11 1 a ■» ceio 
rRUuuxjft J.J..J.3 J .oo7c- 

09 532-546 


"149? 


BL00107 


Protein kinases ATP-
binding region proteins.


HT.flnimn 1*a 11 1 nnn a 
11 384-400 BL00107A 
18.39 5.345e-ll 322- 
353 


1500 


PF00876 


Ogre family. 


PF00876E 7.99 1.947e- 
10 107-117 


1502 


BL00027


'Homeobox' domain
proteins . 


BL00027 26.43 4.789e- 
24 112-155 


1503 


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 4.789e-
24 112-155 


1505 


BL01177 


Anaphylatoxin domain 
proteins . 


BL01177E 20.64 S.BOOe- 
24 448-475 BL01177C 
JL/.jy S. 3336-19 402- 
421 BL01177B 13.61 
7.840e-16 155-171 
BL01177D 17.50 1.900e- 
15 427-445 


1506 


BL00972


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 5".£00e- 

11.93 7.429e-14 48-66 
BL00972E 20 7? R 759r»- 
10 341-363 


1512 


BL00523 


Sulfatases proteins. 


BL00523B 19.27 4.536e- 
22 76-106 BL00523D 
9.89 1.563e-11 40-52
BL00523F 10.85 4.162e-
09 159-170 BL00523G
9.46 5.333e-09 256-266 


1516 


BL00914 


Syntaxin / epimorphin 
family proteins. 


BL00914 24.91 7.045e- 
14 168-218 


1518


BL00600 


Aminotransferases class-
III pyridoxal-phosphate
attachment si.


BL00600A 17.98 6.143e-
19 98-122 BL00600E 
16.43 1.771e-17 302-
331 BL00600G 12.43
9.625e-17 377-396
BL00600B 19.60 5.091e-
15 160-186 BL00600C
16.18 6.040e-12 190-
206 BL00600F 8.77
1.000e-11 343-356
BL00600D 8.71 1.000e-
10 281-295


1523 


PD00930 


PROTEIN GTPASE DOMAIN
ACTIVATION.


PD00930B 33.72 9.600e- 
18 41-82 


1528 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320B 12.19 4.774e-
11 192-207 PR00320B
12.19 8.839e-11 272-
287 PR00320B 12.19
9.743e-10 106-121
PR00320A 16.74 1.878e-
09 192-207 PR00320A 
16.74 2.317e-09 106- 
121 PR00320A 16.74 
8.683e-09 272-287 
PR00320C 13.01 8.800e- 
09 106-121 


1538


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.60 4.508e-
15 171-184


1539 


PF00781 


Diacylglycerol kinase 
catalytic domain 
proteins (presumed) . 


PF00781D 11.11 7.593e- 
10 103-127 


1540 


PR00965 



OCULAR ALBINISM TYPE 1 
PROTEIN SIGNATURE 


PR00965H 10.73 1.231e- 
29 312-334 PR00965E 
12.93 5.846e-29 172- 
195 PR00965F 5.98 

PR00965C 15.04 1.000e-
27 131-151 PR00965D
5.84 1.000e-27 150-170
PR00965G 8.52 2.440e-
27 258-279 PR00965B
4.80 8.650e-26 88-109
PR00965A 12.52 1.000e-
25 35-55 PR00965I 
3.91 6.442e-25 385-406 


1541 


BL01013 


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 9.719e- 
17 163-207 


1543 


PD02699 


PROTEIN DNA-BINDING
BINDING DNA.


PD02699C 24.84 1.000e-
40 599-646 PD02699A
8.91 2.286e-34 219-248
PD02699B 18.26 6.143e- 
21 485-509 


1544 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.857e-
10 102-197 PR00049D 
0.00 7.102e-09 67-82 


1547 


BL00951 


ER lumen protein 
retaining receptor 
proteins. 


BL00951C 19.35 1.000e-
40 93-142 BL00951D
13.94 8.714e-40 142-
177 BL00951A 15.10
1.000e-38 2-38
BL00951B 14.23 6.250e- 
33 38-69 


1548 


BL00536 


Ubiquitin-activating
enzyme proteins.


BL00536F 13.65 8.920e-
30 279-318 BL00536D 
22.91 5.737e-24 21-65 
BL00536E 16.94 4.696e- 
18 248-279 


1549 


PR00139 


ASPARAGINASE/GLUTAMINASE
FAMILY SIGNATURE 


PR00139C 11.72 9.679e- 
09 550-569 


1553 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 5.119e-
09 58-73


1556 


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 6.276e- 
13 67-105 


1557 


BL01228 


Hypothetical cof family 
proteins . 


BL01228D 17.44 8.105e- 
12 107-132 


1558 


BL01228


Hypothetical cof family 
proteins. 


BL01228D 17.44 8.105e- 
12 107-132 


1559 


BL01228


Hypothetical cof family 
proteins . 


BL01228D 17.44 8.105e- 
12 107-132 


1562 


BL00522 


DNA polymerase family X 
proteins . 


BL00522C 11.90 5.600e-
18 412-436 BL00522B
27.30 1.738e-16 364-
410 BL00522A 25.52
6.000e-16 279-326
BL00522E 19.63 5.123e-
14 502-532 BL00522F
14.90 2.385e-13 551-
575 


1563 


PF00651 


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 1.947e-
11 46-59 


1564 


BL00299


Ubiquitin domain
proteins.


BL00299 28.84 2.823e-
10 324-376 


1566 


BL01013 


Oxysterol-binding
protein family proteins.


BL01013D 26.81 8.594e-
17 184-228 BL01013C
9.97 4.906e-12 14-24


1567 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 3.400e-10
378-389 BL00678 9.67
5.800e-10 418-429
BL00678 9.67 8.800e-10 
295-306


1570


BL00479


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 5.235e-
17 297-313 BL00479A
19.86 6.625e-15 271-
294 BL00479A 19.86
2.667e-14 147-170
BL00479B 12.57 6.294e-
12 173-189 


1576 


PR00665


OXYTOCIN RECEPTOR 
SIGNATURE 


PR00665G 12.36 4.^73e- 
24 364-384 PR00665D 
9.93 1.200e-22 138-155 
PR00665F 11.73 4.it00e- 
22 337-354 PR00665C
5.89 1.000e-20 65-80
PR00665B 5.29 4.337e-
19 24-39 PR00665E
5.60 2.929e-15 246-260
PR00665A 5.99 5.622e-
15 11-25 


1577 


DM00099 


4 kw A55R REDUCTASE 
TERMINAL 

DIHYDROPTERIDINE.


DM00099B 14.73 9.308e-
10 127-137 


1579 


BL00524 


Somatomedin B domain 
proteins . 


BL00524A 9.65 6.776e-
14 52-73 


1580 


PD02894 


HYDROLASE N4- PRECURSOR 
PROTEIN SIGNAL BE. 


PD02894B 13.93 6.959e- 
16 182-215 PD02894A 
21.96 2.125e-10 57-103 


1581 


BL00411 


Kinesin motor domain 
proteins. 


BL00411C 15.04 5.292e-
12 32-54 BL00411H
15.66 4.441e-11 245-
276 


1582 


PR00604 


CLASS IA AND IB 
CYTOCHROME C SIGNATURE 


PR00604A 11.13 2.440e- 
09 79-87 


1584 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 1.000e-
10 225-238 


1585


DM01551


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551C 14.62 9.455e- 
11 125-145 


1586 


DM01354 


kw TRANSCRIPTASE REVERSE 
II ORF2.


DM01354S 11.61 7.750e- 
09 474-495


1587 


PR00072 


MALIC ENZYME SIGNATURE 


PR00072B 13.77 7.955e- 
33 180-210 PR00072A 

19 75 £ DA0f*~? c ; 19ft- 

145 PR00072C 11.42 
2.286e-24 216-239 
PR00072D 10.77 3.400e- 
22 276-295 PR00072E 
10.54 1.360e-19 301- 
318 PR00072G 10.45
5.304e-19 433-450
PR00072F 8.37 5.935e- 
15 332-349 


1589 


BL00191 


Cytochrome b5 family,
heme-binding domain
proteins.


BL00191H 15.64 1.537e- 
99 ci -ill DT.nni oi v 

17.38 9.027e-12 398- 
442 


1590 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III.


DM01970B 8.60 7.716e- 
13 211-224 DM01970B 
8.60 2.157e-12 94-107 


1591


DM00517


5 kw NUCLEAR 60.7 NUP1
CHROMOSOME.


DM00517B 10.96 6.625e-
16 1175-1193 DM00517A
8.21 1.000e-11 1015-
1026 


1592


BL00037


Myb DNA-binding domain
proteins repeat proteins
proteins.


BL00037B 15.92 3.250e-
27 116-142 BL00037A
16.68 2.500e-24 83-107
BL00037A 16.68 3.250e- 
12 31-ba BijOOO 3 /B 

15.92 3.526e-11 64-90
10 146-164 


1595 


BL00028 


Zinc finger, C2H2 type,

domain proteins. 


"ht fttift9fT"i c n*7 i ci Aft, 

09 110-127 


1598 


PF00628


PHD-finger.


rr UUbiio Is ■ j .ZdUc- 
11 1667-1682 


1599 


PR00014


FIBRONECTIN TYPE III


PR00014D 12.04 5.500e- 

V9 jOU" J J 3 


l£66 


BL00518 


Zinc finger, C3HC4 type 


BL00518 12.23 6.571e- 
in m.io 


1602 


BL00412 


Neuromodulin (GAP-43) 
proteins.


BL00412D 16.54 5.402e- 
10 136-187 


1605 


PF00651 


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 3.571e- 
10 44-57 


1607


BL00252 


Interferon alpha, beta
and delta family
proteins.


BL00252A 18.49 6.657e- 
19.78 9.125e-16 58-109 


1610 


DM00215


PROLINE-RICH PROTEIN 3.


DM00215 19.43 1.000e-
08 61-94 


1611 


BL00904 


Prenyltransferases alpha
subunit repeat proteins
proteins.


10 91-125 BL00904D 
1.47 6.018e-09 127-168


1612 


PF00168 


C2 domain proteins. 


PF00168C 27.49 3.250e-
09 365-391 


1613 


BL00412 


Neuromodulin (GAP-43) 


BL00412D 16.54 6.051e- 

16.54 7.153e-09 933-
984 


1614 


BL00559 


Eukaryotic molybdopterin 
oxidoreductases
proteins. 


BL00559I 13.63 3.531e- 
25 54-83 BL00559K 
13.17 2.957e-18 197- 
224 BL00559J 19.63 
6.870e-16 124-176 
BL00559L 13.60 9.000e- 
16 266-284 


1615


PD01427 


TRANSFERASE 
METHYLTRANSFERASE BI.


PD01427B 22.45 3.025e- 
22 500-541 PD01427A 
19.94 8.773e-18 439-
472 


1616 


BL00115


Eukaryotic RNA 
polymerase II 
heptapeptide repeat 
proteins. 


BL00115Z 3.12 7.485e-
09 152-201 BL00115Z 
3.12 9.603e-09 145-194 


1617 


BL00303


S-100/ICaBP type calcium
binding protein. 


BL00303B 26.15 7.750e- 
32 51-88 BL00303A 
21.77 1.947e-31 4-41 


1618


BL01254 


Fetuin family proteins.


BL01254F 10.02 8.754e- 
09 137-147 


1619 


PD01888 


PEPTIDE REDUCTASE 
PROTEIN METHI.


PD01888B 25.10 1.000e-
40 47-97 PD01888C 
21.56 7.000e-30 125- 
155 PD01888A 12.84 
8.800e-15 7-23 


1621 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 3.455e- 
09 692-704 PR00239B 
1.58 4.580e-09 697-709 
PR00239E 1.58 4.580e- 
09 702-714 PR00239E 
1.58 5.193e-09 703-715 


1622 


PR00860 


VERTEBRATE 
METALLOTHIONEIN
SIGNATURE


PR00860B 7.04 1.900e-
18 27-41 PR00860C
9.61 1.474e-14 41-51 
PR00860A 5.46 1.720e- 
14 5-18 


1624 


PR00784 


MITOCHONDRIAL BROWN FAT
UNCOUPLING PROTEIN 
SIGNATURE 


PR00784D 15.86 8.027e- 
11 77-95 


1626 


BL00325 


Actin-depolymerizing
proteins.


BL00325B 21.66 1.000e-
40 93-139 BL00325A 
24.83 6.786e-23 61-93 


1631 


BL00064 


L-lactate dehydrogenase 
proteins. 



BL00064B 23.57 1.000e-
40 82-130 BL00064C
17.28 1.000e-40 137-
182 BL00064E 27.20
1.000e-40 223-275
BL00064F 25.14 7.882e- 
36 286-331 BL00064A 
21.16 1.000e-33 22-60 
BL00064D 14.19 6.500e- 
31 182-212 




PR00063


RIBOSOMAL PROTEIN L27 
SIGNATURE 


PR00063B 15.24 9.700e- 
11 59-84 PR00063A 
11.71 1.614e-09 34-59 


1634 




MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239D 0.00 1.105e-
11 36-49 PR00239C 
3.51 2.538e-09 37-45 


1636


BL01210


Caveolins proteins.


BL01210B 13.92 9.531e-
10 133-183 


1637 


BL00982 


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 5.388e-
11 11-43 


1639 


BL01183 


ubiE/COQ5 

methyltransferase family
proteins. 


BL01183B 21.31 8.144e- 
12 132-177 


1640 


PR00015


GRAM-POSITIVE COCCUS
SURFACE PROTEIN ANCHOR 
SIGNATURE 


PR00015B 9.84 8.468e- 
10 128-149 


1641 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


11 364-379 PR00320A 
16.74 7.828e-ll 364- 
379 PR00320C 13.01 
2.800e-10 279-294 
PR00320C 13.01 2.800e- 
10 364-379 PR00320B
12.19 5.114e-10 279-
294 PR00320A 16.74
1.659e-09 279-294
PR00320A 16.74 2.098e-


1642 


PF00023


Ank repeat proteins.


PF00023A 16.03 6.464e-
09 114-130 


1643 


PR00169


POTASSIUM CHANNEL
SIGNATURE


PR00169A 16.77 1.806e- 
11 74-94 


1644 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 2.200e-10
109-120 BL00678 9.67 
5.737e-09 528-539 


1645 


BL01108 


Ribosomal protein L24 
proteins . 


BL01108A 20.33 7.366e- 
17 56-89 


1646 


PR00380


KINESIN HEAVY CHAIN
SIGNATURE


PR00380A 14.18 9.270e-
21 103-125 PR00380D
9.93 6.308e-18 386-408
PR00380C 13.18 7.923e- 
16 332-351 PR00380B 
12.64 6.657e-15 292- 
310 


1647 


DM01242 


3 THREONINE--TRNA
LIGASE.


DM01242C 17.15 9.791e- 
37 340-381 DM01242E 
23.00 5.071e-31 463-
505 DM01242D 23.29 
3.925e-30 420-463 
DM01242B 23.57 8.054e- 
18 265-314 DM01242F 
10.61 7.618e-14 526-
540 


1649


PD00126


PROTEIN REPEAT DOMAIN
TPR NUCLEA.


PD00126A 22.53 5.500e- 
10 13-34 


1651


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 6.720e- 
11 431-485 




BL00933


FGGY family of 
carbohydrate kinases 
proteins . 


BL00933A 17.50 4.673e- 
12 11-35 BL00933E 
13.80 9.217e-09 456- 
472 


1653 


BL00795 


Involucrin proteins. 


BL00795C 17.06 2.98Be- 
10 70-115 


1654 


BL00982


Bacterial-type phytoene
dehydrogenase proteins. 


BL00982A 18.41 LlSOe- 
17 302-334 


1655 


BL00982


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 7.750e-
17 282-314


1656 


BL00741 


Guanine-nucleotide 
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 1.391e- 
16 607-630 


1657 




TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 7.938e- 
11 114-136 


1658 


PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE


PR00910A 2.51 8.889e-
10 442-455 


1659 


BL00972


Ubiquitin carboxyl-
terminal hydrolases
family 2 proteins. 


BL00972D 22.55 4.140e- 
12 376-401 BL00972E 
20.72 5.629e-09 446- 
468 


1660 




Actins proteins. 


BL00406D 12.58 6.767e- 
15 188-243


1661 


PR00105 


CYTOSINE-SPECIFIC DNA 
METHYLTRANSFERASE
SIGNATURE 


PR00105A 10.36 4.900e- 
13 1140-1157 PR00105B 
12.32 2.800e-12 1259- 
1274 PR00105C 10.86


1662 


BL00280


Pancreatic trypsin 
inhibitor (Kunitz) 
family proteins. 


BLO02B0 24.^1 3.172e- 
33 3119-3163 


1663 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e-
23 107-125 PR00319C 
13.41 5.714e-20 89-105 
PR00319A 15.27 5.286e- 
19 51-68 PR00319B 
11.47 8.200e-19 70-85


1664 


BL00018


EF-hand calcium-binding
domain proteins.


BL0O01B 7.41 S.OSOe-lO 
489-502 


1667 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL- 


PD01066 19.43 8.500e- 
38 7-46 


1669 


BL01153


NOL1/NOP2/sun family


BL01153D 19.69 1.188e- 
17 115-141 BL01153C
13.67 8.977e-15 66-80 
BL01153B 20.52 1.885e- 
10 13-37 


1671 


PR00678


PI3 KINASE P85
SIGNATURE


PR00678H 9.13 3.100e-
10 1146-1169 


1672 


BL00598


Chromo domain proteins.


BL00598 14.45 8.500e- 
20 27-49 


1673 


PR00326


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE


PR00326A 8.75 8.329e- 
09 686-707 


1674 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.580e-
11 343-358 PR00049D
0.00 1.286e-10 342-357


1676 


PR00747 



GLYCOSYL HYDROLASE 
FAMILY 47 SIGNATURE 


PR00747H 12.76 8.636e- 
19 427-448 PR00747G 
14.50 2.286e-18 368- 
393 PR00747C 12.06 
7.500e-18 112-131 
PR00747A 14.05 4.600e- 
17 42-63 PR00747D 
15.23 8.759e-17 163- 
183 PR00747E 15.13
8.244e-15 254-272
PR00747B 7.65 5.355e- 
13 75-90 PR00747F 
13.56 8.714e-10 311- 
328 


1677


PR00747


GLYCOSYL HYDROLASE 
FAMILY 47 SIGNATURE 



PR00747H 12.76 8.636e- 
19 309-330 PR00747G 
14.50 2.286e-18 250- 
275 PR00747C 12.06 
7.500e-18 112-131 
PR00747A 14.05 4.600e- 
17 42-63 PR00747B
7.65 5.355e-13 75-90
PR00747F 13.56 8.714e- 
10 193-210 


1680


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 4.600e-10
406-417 BL00678 9.67
6.684e-09 320-331 


1681 




Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 4.600e-10
329-340 BL00678 9.67
6.684e-09 243-254


1683


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 1.346e- 
13 389-410 


1685 


PR00646


RDC1 ORPHAN RECEPTOR 
SIGNATURE 


PR00646H 6.32 4.188e-
09 755-771 


1690 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 6.644e-
09 75-129 


1691 


PR00456 


RIBOSOMAL PROTEIN P2
SIGNATURE 


PR00456E 3.06 7.281e- 
10 418-433 PR00456E 
3.06 7.281e-10 419-434 

10 420-435 


1692 


PR00456


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 7.281e- 
10 487-502 PR00456E 
3.06 7.281e-10 488-503
PR00456E 3.06 8.125e-
10 489-504 


1693 


BL00674


AAA-protein family
proteins.


BL00674C 22.60 8.043e- 
24 274-317 BL00674B
4.46 4.000e-23 241-263 
BL00674D 23.41 8.560e- 
18 338-385 BL00674E 
15.24 1.720e-15 414- 
434 


1697 


PR00409 


PHTHALATE DIOXYGENASE
REDUCTASE FAMILY 
SIGNATURE 


PR00409F 12.70 4.388e- 
10 427-447 


1698 


PR00466 


CYTOCHROME B-245 HEAVY 
CHAIN SIGNATURE 


PR00466C 10.17 3.443e- 
13 187-208 PR00466B 
5.03 5.500e-11 162-186
PR00466F 9.16 6.159e- 

*» 3 O 3 ± l 


1699 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 9.217e- 
12 283-300 BL00028 
16.07 3.769e-11 255-
272 BL00028 16.07
5.154e-11 171-188
BL00028 16.07 S.SOOe- 

11 / oliUUUZo 

16.07 1.600e-10 199- 
216 


1700 




family proteins. 


BL01019A 13.20 3.348e-
15 62-102 BL01019B 
19.49 4.000e-15 107- 
162 


1703 


PD01066


PROTEIN ZINC FINGER 
BINDING NU. 


PD01066 19.43 2.484e- 
12 200-239


1707 


PR00109


TYROSINE KINASE
CATALYTIC DOMAIN


PR00109B 12.27 4.558e-
14 134-153 


1710 


PR00019 


LEUCINE-RICH REPEAT 

SIGNATURE


PR00019A 11.19 2.565e- 
10 116-130 PR00019B

11.35 4.600e-09 113- 
1 5 7 DRnnm qd n ic 

7.120e-09 204-218 


1711 


BL01159


WW/rsp5/WWP domain

proteins . 


11 232-247 BL01159 

11 85 5 4ftftp-10 fill- 

628 


1712 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 7.000e-

10 187-50'? 

Iw AO/ *«UJ 


1713 


PF00642 


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 9.550e- 


1714 


PF00642


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 9.550e- 
11 230-241 


1715 


BL01115 


protein ran proteins. 


BL01115A 10 22 *7 iPO P , 

09 7-51 


1718


BL00353 


HMG1/2 proteins. 


BL003S3C 14 83 6 01 rV-"" 
10 136-183 BL00353B 
11.47 8.866e-09 86-136 


1719 


BL00412 


Neuromodulin (GAP-43) 
proteins. 


BL00412D 16.54 5.408e- 
09 432-483 


1721 


BL00038


Myc-type, 'helix-loop-
helix' dimerization
domain proteins.


BL00038B 16.97 8.448e-
12 79-100 BL00038A
13.61 4.000e-11 52-68


1723 


PD00567


PROTEIN RNA-BINDING RNA
REPEAT HYD. 


PD00567C 9.17 8.500e- 
09 418-428 


1724 


BL01279 


Protein-L-
isoaspartate(D-
aspartate) O-
methyltransferase signa.


BL01279A 24.27 5.663e- 
12 233-281 


1728 


BL00018 


EF-hand calcium-binding 
domain proteins. 


BL00018 7.41 2.059e-11
73-86 BL00018 7.41
4.176e-11 157-170


1730


BL00594 


Aromatic amino acids 
permeases proteins. 


BL00594A 16.75 1.089e- 
09 17-61


1731 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 9.676e-
10 296-350 


1732 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 9.676e- 
10 316-370 


1733 


PF00850 


Histone deacetylase
family.


PF00850F 15.70 4.349e-
22 246-279 PF00850D
14.76 6.850e-20 177-
201 PF00850E 8.88
8.691e-18 209-235
PF00850G 22.75 4.098e-
14 281-323 


1734 


BL00354


HMG-I and HMG-Y DNA-
binding domain proteins
(A+T-hook).


BL00354C 6.61 5.932e- 
09 292-307 


1735 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 5.263e-
10 492-502 


1743 


PR00449


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.188e-
11 5-27 PR00449D 
10.79 2.241e-10 109- 

123 PR00449E 13.50
9.289e-10 144-1S7 


1744 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.188e-
11 5-27 PR00449D 

123 PR00449E 13.50 
9.289e-10 144-1G7 


1745 


BL00720


Guanine-nucleotide
dissociation stimulators 
CDC25 family sign. 


BL00720B id ^1 R ^>q7t»- 
15 136-160 


1746 


PR00081


GLUCOSE/RIBITOL 
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081B 10.38 6.727e- 
11 45-57 PR00081E 
17.54 3.935e-10 150- 
168 


1747 


BL00439 


Acyltransferases
ChoActase / COT / CPT
family proteins.


BL00439H 18.24 8.435e- 
14 65-91 BL00439G 
13 40 2 S95f»-1? 7-14 


1749 


PR00819 


CBXX/CFQX SUPERFAMILY 


PR00819B 10.83 7.158e-
11 4-20 


1751 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 3.400e-
14 33-46 PD00066
13.92 1.000e-13 89-102
PD00066 13.92 7.000e-
13 61-74 PD00066
13.92 6.571e-12 117-
130


1753 


BL01013 


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 6.516e- 
18 33-77 


1754


BL00790


Receptor tyrosine kinase
class V proteins.


BL00790I 20.01 2.393e-
09 490-521 BL00790I 
20.01 2.821e-09 60-91 
BL00790I 20.01 6.357e- 
09 287-318 


1756


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.750e-
35 10-49 


1758 


DM00406 


GLIADIN.


DM00406 7.73 7.600e-09
653-656 


1762 


PD02929 


ADHESION GLYCOPROTEIN- 
PRECURSOR I. 


PD02929A 28.27 4.*29e- 
09 224-278 


1765 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE


PR00326A 8.75 5.950e- 
11 146-167 


1775 


PF00023 


Ank repeat proteins.


PF00023A 16.03 3.077e- 
14 523-539 


1776 


BL00942 


glpT family of 
transporters proteins.


BL00942F 15.07 4.343e- 
10 371-389 BL00942B 
20.36 8.040e-09 94-137 


1777 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.373e- 
09 279-312


1778 


BL00084


Copper type II,
ascorbate-dependent
monooxygenases proteins.


BL00084D 25.11 3.700e- 
20 169-224 BL00084B 
24.26 8.134e-16 10-58 
BL00084C 27.71 8.412e- 
11 107-158


1779 


BL01013 


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 3.758e- 
18 611-655 BL01013A 
25.14 2.891e-15 344- 
380 BL01013C 9.97 

6 308^-15 diq./idt; 

BL01013B 11.33 3.717e- 
12 409-420 


1783 


BL00741 


Guanine-nucleotide 
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e- 
13 492-515 


1784 


BL00741 


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e- 
13 492-515 



* Results include, in order: accession number subtype; raw score; p-value; position of
signature in amino acid sequence.
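
The field order given in the footnote above can be read mechanically. The following is a minimal sketch, assuming each RESULTS cell is a whitespace-separated run of four-field hits (subtype accession, raw score, p-value, start-end position) once line wraps are joined. The names `SignatureHit` and `parse_results` are hypothetical and are not part of the specification.

```python
import re
from typing import NamedTuple


class SignatureHit(NamedTuple):
    """One signature hit, in the field order stated in the table footnote."""
    subtype: str      # accession number subtype, e.g. "BL00790I"
    raw_score: float  # raw score
    p_value: float    # p-value
    start: int        # first residue of the signature in the amino acid sequence
    end: int          # last residue of the signature


# Assumed layout: two letters, five digits, optional subtype letter,
# then score, p-value in exponent notation, and a start-end range.
_HIT = re.compile(
    r"(?P<subtype>[A-Z]{2}\d{5}[A-Z]?)\s+"
    r"(?P<score>\d+(?:\.\d+)?)\s+"
    r"(?P<p>\d+(?:\.\d+)?e-\d+)\s+"
    r"(?P<start>\d+)-(?P<end>\d+)"
)


def parse_results(cell: str) -> list[SignatureHit]:
    """Split one RESULTS cell into its individual hits."""
    return [
        SignatureHit(
            m["subtype"],
            float(m["score"]),
            float(m["p"]),
            int(m["start"]),
            int(m["end"]),
        )
        for m in _HIT.finditer(cell)
    ]


# Example cell taken from the entry for SEQ ID NO: 1754 (wrapped lines joined).
hits = parse_results(
    "BL00790I 20.01 2.393e-09 490-521 "
    "BL00790I 20.01 2.821e-09 60-91 "
    "BL00790I 20.01 6.357e-09 287-318"
)
```

Cells whose characters were damaged in reproduction will simply fail to match the pattern, so a short hit count may be taken as a cue to check the entry by eye.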
TABLE 4



SEQ ID 
NO: 


PFAM NAME 


DESCRIPTION 


p-value


PFAM
SCORE 


2 


ig


Immunoglobulin domain 


2.1e-32 


109.5 


3 


pkinase 


Eukaryotic protein kinase 
domain 


1.3e-29 


110.7 


4 


zf-C2H2


Zinc finger, C2H2 type


1.6e-21 


84.9 


5 


fn3 


Fibronectin type III domain 


0 


1097.1 


6


fn3 


Fibronectin type III domain 


0 


1035.0 


7 


fn3 


Fibronectin type III domain 


0 


1090.4 


8 


fn3 


Fibronectin type III domain 


0 


1097.1 


9 


TBC 


TBC domain 


4e-40


146.7


10 


p450 


Cytochrome P450 


9.5e-17


62.0 


12 


ank 


Ank repeat


6e-20


79.7


14 


ig


Immunoglobulin domain 


1.7e-05 


22.7 


15 


zf-MYND 


MYND finger 


1.3e-06


2 D * *k 


16


zf-MYND


MYND finger


x - je-uo 


35.4 


17 


zf-C2H2 


Zinc finger, C2H2 type 


x . /e*5? 


343.9 


18 


CAP_GLY


CAP-Gly domain 


1.2e-25 


98.7 


20 


IMPDH_C 


IMP dehydrogenase / GMP
reductase C terminus


1 . be-119 


410.5


21


IMPDH_C


IMP dehydrogenase / GMP
reductase C terminus


4.3e-102


352.6


22 


pkinase 


Eukaryotic protein kinase
domain




277.0


23 


pkinase 


Eukaryotic protein kinase 
domain 


8.4e-74 


258.6 


25 


RNA_pol_A


RNA polymerase alpha subunit 


A 

u 


1077.7


26


C1q


C1q domain


1 Qa.l D 


44.4


27


Ribosomal_L2
3


Ribosomal protein L23


^ Bo.n 
/ . oe-j^ 


111.2


28


Ribosomal_L2
3


Ribosomal protein L23


1e-29


104.2


30 


zf-A20 


A20-like zinc finger


1.5e-10


48.5


31 


zf-A20 


A20-like zinc finger 


1.5e-10 


48.5 


32 


FMN_dh 


FMN-dependent dehydrogenase


5.4e-179


DUO . X 


34 


PID 


Phosphotyrosine interaction 
domain (PTB/PID) 


3.8e-59 


209.9 


35 


ig 


Immunoglobulin domain 


1.4e-13 


48.8 


36 


ig 


Immunoglobulin domain 


1.4e-13


48.8 


40 


kinesin 








44 


Ets 


Ets-domain 


1.4e-56


182.1


45 


Ets 


Ets-domain 


1.4e-56 


182.1 


46 


LRR 


Leucine Rich Repeat 


X* fc*U 


58.3


48


zf-C2H2


Zinc finger, C2H2 type


2.3e-152


CO □ 


49 


ITAM


Immunoreceptor tyrosine-based
activation mot


1.4e-05


Jl . f 


50 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


1.1e-26


102.0


51 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


1.1e-26


102.0 


52 


ras 


Ras family 


6.5e-45


162.3 


53 


PRK 


Phosphoribulokinase 


2.1e-65 


230.7 


54 


myb_DNA-
binding


Myb-like DNA-binding domain 


0.056 


15.2


55 


voltage_CLC 


Voltage gated chloride channels 


3.3e-186 


631.9 


56 


sugar_tr 


Sugar (and other) transporter 


0.00015 


-64.3 


57


TBC 


TBC domain 


2.2e-37 


137.6 


58 


ank 


Ank repeat 


5.9e-25 


96.3 


59 


ank 


Ank repeat


5.9e-25


96.3 


67 


PMP22_Claudi
n


PMP-22/EMP/MP20/Claudin family 


7.9e-49 


175.6 


68


C2 


C2 domain 


7.9e-54 


192.2 


69 


C2 


C2 domain 


2.3e-54 


194.0 


70 


Kelch


Kelch motif 


9.4e-99 


341.5 


72 


ig


Immunoglobulin domain 


8.2e-28 


94.7 


73 


pkinase 


Eukaryotic protein kinase
domain


8e-69


242.1


74 


pkinase 


Eukaryotic protein kinase 
domain 


£. , oe-Jo 


140.6


7^ 


zf-
C4_Topoisom


Topoisomerase DNA binding C4
zinc fing


5.4e-54




83 


Peptidase_S9


Prolyl oligopeptidase family


4.5e-10


"'-id 5 

JO * 0 


84


fn3


Fibronectin type III domain


4.1e-51


183.2


|'8ct 


SH2 


Src homology domain 2 


J • -i-C ~ 


Si .1 


88 


ig 


Immunoglobulin domain 


o nnoi 


14.0


[ 09 


WD40 


WD domain, G-beta repeat 


« • lc — z i. 


84.6


92


laminin_G




© . xe-2 / 


98.5 


93 


AMP-binding 


AMP-binding enzyme 


2.4e-13 


-37.2 


95 


pkinase 


Eukaryotic protein kinase
domain


1.4e-59 


211.4 


96


pkinase


Eukaryotic protein kinase
domain


2.6e-51


183.9


97 




short chain dehydrogenase


2e-61


217.5


98 


kinesin 


Kinesin motor domain 


2.2e-86 


300.4 


101


IRS


PTB domain (IRS-1 type)


5.4e-36


133.0 


102 


AAA


ATPases associated with various
cellular act 


6 . Be-p5 


-5.2 


104

106




Eukaryotic protein kinase
domain


2.7e-73


256.9




ras




8.3e-24


-92.5


107


FYVE


FYVE zinc finger


5.4e-27 


100.7 


108 


Cyt_reductas
e


FAD/NAD-binding Cytochrome 
reductase 


7.7e-61 


215.5 


109


zf-C2H2


Zinc finger, C2H2 type


2.3e-122 


420.0 


113 


pkinase 


Eukaryotic protein kinase
domain 


4e-88 


306.2 


116 


PH 


PH domain


3.1e-11


45.2 


117


lipocalin


Lipocalin / cytosolic fatty-
acid binding pr


2.4e-14 


53.5 


118 


pkinase 


Eukaryotic protein kinase
domain


4.5e-20


76.3


120 


WD40 




2.4e-14


61.1


121


WD40




2.4e-14


61.1 


123 


IF5_eIF4_eIF
2


eIF4-gamma/eIF5/eIF2-epsilon


1e-32


122.2 


124
127


ig




6.5e-08


30.6




mito_carr


Mitochondrial carrier proteins 


3e-16 


58.6 


128 


PP2C 


Protein phosphatase 2C


2.2e-71


250.6 


129


ATP1G1_PLM_M
AT8


ATP1G1/PLM/MAT8 family


3.1e-20


80.6 


130


pfkB


pfkB family carbohydrate kinase


4.5e-42


137.1 


133


ACBP


Acyl CoA binding protein


4.6e-22


86.7 


134 


rrm


RNA recognition motif.


1.2e-31


118.5


135


IQ 




2.6e-08


41.0 


rin 


ATP1G1_PLM_M
AT8


ATP1G1/PLM/MAT8 family


9.3e-22


85.7 


139 


WH2


Wiskott Aldrich syndrome
homology region 2 


0.00*7 


23.1 


140 


zf-C2H2 


Zinc finger, C2H2 type 


i • /e-o£ 


287.5 


141
143


Peptidase_S2
6


Signal peptidase I


d . /e- JLU 


35.7




arf 




i • «e-jy 


145.2


146


KRAB


KRAB box


7.3e-30


112.6


148 


DUF6 


Integral membrane protein DUF6


0.096 


8.0 


149
151


PDEase


3'5'-cyclic nucleotide
phosphodiesterase


3.8e-80


231.1 




S4


S4 domain


1.1e-08


42.3


153


tRNA-synt_1d


tRNA synthetases class I (R) 


3.8e-103 


356.1 


154


Cyt_reductas 
e 


FAD/NAD-binding Cytochrome 
reductase 


7.8e-$ t 0 


212.2 


155
157


ras


Ras family


3.6e-28


107.0 




actin 


Actin 


3.8e-26 


87.1


158 


Jacalin 


Jacalin-like lectin domain


0.09 


-24.9 


160 


Zn_carbOpept


Zinc carboxypeptidase 


5e-l3fl 


471.9 


165 


pkinase


Eukaryotic protein kinase 
domain 


5.1e-67 


236.1 


167 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


5.3e-07


27.0 


168 


Ribosomal_S1
5


Ribosomal protein S15


1.1e-06


29.0


169


DEAD


DEAD/DEAH box helicase


1e-48


157.0 


171 


DUF59 


Domain of unknown function 
DUF59


0.07 


-17.4 


172 


pkinase


Eukaryotic protein kinase 
domain 


3.7e-15 


58.6 


173 


globin 


Globin 


4.6e-18 


67.4 


174 


WW 


WW domain 


7.3e-06 


32.9 


175 


ras 


Ras family 


1e-31


118.8


178


ATP1G1_PLM_M
AT8


ATP1G1/PLM/MAT8 family


2.5e-17


71.0 


179 


zf-C2H2


Zinc finger, C2H2 type


1.5e-99


344.2


180


C1q


C1q domain


8.8e-72 


251.9 


190 


Y_phosphatas
e


Protein-tyrosine phosphatase


4.9e-287


967.0 


191 


efhand 


EF hand 


7.5e-16 


66.1 


193 


pkinase 


Eukaryotic protein kinase
domain


6.5e-82


285.6


194 


bromodomain 


Bromodomain 


5.8e-31


111.4 


195 


PALP 


Pyridoxal-phosphate dependent
enzyme


2.5e-64


227.1


197 


DnaJ 


DnaJ domain 


1.6e-38 


141.4 


199 


RrnaAD 


Ribosomal RNA adenine 
dimethylases 


0.00018 


l*-9 


200 


acid_phospha 
t 


Histidine acid phosphatase 


2.5e-10


37.2 


201 


WH2 


Wiskott Aldrich syndrome 
homology region 2 


0.00048


26.9 


204 


vATP- 
synt_AC39 


ATP synthase (C/AC39) subunit 


1.3e-159 


543.7 


205 


vATP-
synt_AC39 


ATP synthase (C/AC39) subunit 


l.^e-139 


476.9 


. 20* 


ldl_recept_a 


Low-density lipoprotein
receptor domain


2.4e-25


97.6 


209 


ank 


Ank repeat


1.4e-19


78.4 


210 


Rhomboid 


Rhomboid family 


0.0035 


1.2 


211 


C1q


C1q domain


1.6e-70 


247.7 


212 


UQ_con 


Ubiquitin-conjugating enzyme


7.4e-74 


258.8 


213 


UQ_con


Ubiquitin-conjugating enzyme


1e-53


191.9 


215 


DEAD 


DEAD/DEAH box helicase 


1.8e-43


140.4 


216 


PMP22_Claudi 
n 


PMP-22/EMP/MP20/Claudin family 


4.5e-21


83.4 


218 


Glycos_trans 
f_2 


Glycosyl transferases


4e-21


83.6 


219 


ig 


Immunoglobulin domain 


0.092 


10.7 


222 


WD40


WD domain, G-beta repeat 


7.4e-23 


89.4 


224 


TPR 


TPR Domain 


1.2e-08 


42.1 


225 


DnaJ_CXXCXGX
G


DnaJ central domain (4 repeats) 


1.5e-38 


141.5 


226 


DnaJ_CXXCXGX 
G 


DnaJ central domain (4 repeats) 


1.5e-38


141.5 


229 


HSP70


Hsp70 protein


2.4e-54


194.0 


230 


GSHPx 


Glutathione peroxidases 


3.4e-47


170.2 


231 


tsp_1


Thrombospondin type 1 domain


0.0075 


17.1 


233 


cyclin 


Cyclin 


4.6e-144 


492.0 


234 


ras 


Ras family 


4.8e-50 


179.7 


235 


LRR 


Leucine Rich Repeat 


1.2e-30 


115.3 


236 


LRR


Leucine Rich Repeat 


^.7e-29 


109.4


237 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


1.7e-09 


45.0


244 


dCMP_cyt_dea
m


Cytidine and deoxycytidylate
deaminase


2.5e-05


31.1


245 




Immunoglobulin domain


6.7e-08 


30.5 


248 


wnt 


wnt family of developmental 
signaling protei


9.1e-270 


742.6 


250 


mito_carr


Mitochondrial carrier proteins 


1.3e-5S 


193.6 


254 


2186 


Adenylate kinase 


1.8e-14 


55.7 


255 


Cation_efflu 
x 


Cation efflux family 


2.8e-33 


124.0


256




bnj domain 


3 . 9e-14 


60.4 


257 


Aa_trans 


Transmembrane amino acid 
transporter protein 


2.6e-52 


187.2 


258 


adenylatekin
ase


Adenylate kinase 


2.1e-110 


380.2


259


HIT


HIT family


8.2e-07


25.3


260 


PQQ


PQQ enzyme repeat 


1.6e-15 


65.0 


X 0 c. 


proteasome 


Proteasome A-type and B-type


6.5e-64 


225.7 


it) / 


pkinase


Eukaryotic protein kinase 
domain 


6.3e-27 


101.0 


270 


filament 


Intermediate filament proteins 


3.2e-150


512.5


27"x 


Choline_kina
se


Choline/ethanolamine kinase


2e-67 


237.4 


277


Ribosomal_S7


Ribosomal protein S7p/S5e


3.3e-20


80.6 


279 


pkinase 


Eukaryotic protein kinase 
domain 


3 .3e-7? 


269.9


nan 


WD40 


WD domain, G-beta repeat 


7.8e-73


255.4 




WD40 


WD domain, G-beta repeat 


7.8e-73 


255.4 


284


zf-DHHC


DHHC zinc finger domain 


4.6e-24 


93.4




Exonuclease


Exonuclease 


1.4e-67 


238.0


291


SAM


SAM domain (Sterile alpha 
motif) 


0.034 


11.2 


292 


SAM 


SAM domain (Sterile alpha 
motif) 


0.034 


11.2 






Zinc finger, C2H2 type 


1.4e-29


111.7 


~29? 


zf-C2H2


Zinc finger, C2H2 type


2.2e-125


430.0 


29* 


mito_carr


Mitochondrial carrier proteins


4.1e-59


205.5 




HMG_box


HMG (high mobility group) box 


6.7e-29 


109.4 


302 


Glycos_trans
f_4


Glycosyl transferase


5e-87 


302. S * 




tRNA-synt_2


tRNA synthetases class II (D, K
and N)


1.1e-84


294.8 


305 


KRAB


KRAB box 


2e-44 


161.0 


306 




RNA recognition motif.


2.7e-44 


160.6 


308 




7 transmembrane receptor 
(rhodopsin family) 


5.2e-39


12*. i 


309 


DNA_polymera
seX


DNA polymerase X family 


2.4e-64 


227.2 


311 


F-box


F-box domain. 


9.5e-08 


39.2 


312 




Immunoglobulin domain


6.8e-19


65.9


313


Ets


Ets-domain


8.1e-60


192.3 


315 


Kelch 


Kelch motif


1.3e-106


367.6 


317 


arf


ADP-ribosylation factor family


3.2e-35


130.4 


318 




Sugar (and other) transporter


0.0003


-73 ,f 


320 


pkinase 


Eukaryotic protein kinase
domain


8.1e-83 


288.6 


322 


pkinase 


Eukaryotic protein kinase 
domain 


4.9e-81 


282.6 


324 


Xlink 


Extracellular link domain 


4.5e-143 


331.5 


326 


ARID 


ARID DNA binding domain 


5.1e-37


136.4 


327 


HMG_box 


HMG (high mobility group) box 


5.7e-29 


109.4


328 


cadherin 


Cadherin domain 


8.1e-81


281.9 


331 


chromo


'chromo' (CHRromatin
Organization Modifier)


4e-18


66.7 


333 


Peptidase_M2
2


Glycoprotease family 


1.2e-136 


467.4


335 


vwa 


von Willebrand factor type A 
domain 


2.*e-07 


37.9 




ras 


Ras family 


7.8e-07 


-59.1 


J u 


zf-C2H2


Zinc finger, C2H2 type


8.2e-64


225.4 




zf-C2H2


Zinc finger, C2H2 type 


2.4e-85 


297.0 


7/17 


ig


Immunoglobulin domain 


0.0005 


18.0 




pkinase 


Eukaryotic protein kinase 
domain 


6.5e-65 


229.1 


74 7 




Eukaryotic protein kinase
domain


6.5e-65


229.1 


"7 err 


EGF 


EGF-like domain 


8.5e-20


79.2 


7 CO 


ank


Ank repeat


2.5e-101


350.0 


354 


TBC 


TBC domain 


5.1e-15


63.3


355


PHD


PHD-finger


3.2e-07


37.4 


358 


DUF6 


Integral membrane protein DUF6 


0.033 


15.8 


359 


zf-C2H2 


Zinc finger, C2H2 type 


7.4e-20 


79.4 


361 


ank 


Ank repeat 


6.6e-34


126.1 


362 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


4.7e-53 


189.7 


363 


efhand 


EF hand


5.4e-10 


46.6 


367 


LRR


Leucine Rich Repeat 


8.8e-44 


158.9 


368 


laminin_G


Laminin G domain 


1.5e-33 


121.7 


369 


PP2C


Protein phosphatase 2C


5.3e-20


73.9 


372 


LIM 


LIM domain containing proteins 


9.9e-15


57.1 


373 


KRAB 


KRAB box 


4.8e-23


90.0


376


ion_trans


Ion transport protein 


2.9e-09 


-4.2 


377 


Beach 


Beige/BEACH domain


4.9e-208 


704.5 


380 


pkinase


Eukaryotic protein kinase 
domain 


1.6e-94 


327.5 


381 


AMP-binding


AMP-binding enzyme


1.4e-07 


-140.3 


382 


HECT 


HECT-domain (ubiquitin-
transferase).


1.3e-07


-13.5


384 


ank 


Ank repeat 


2.5e-101 


350.0 


386 




Immunoglobulin domain 


9.5e-06 


23.6 


388 


zf-C2H2 


Zinc finger, C2H2 type 


1.7e-42 


154.6 


389


ig 


Immunoglobulin domain 


2.8e-15 


£4.3 


390 


mito_carr


Mitochondrial carrier proteins 


3.5e-67 


233.2 


392 


TPR 


TPR Domain 


6.1e-17 


69.7 


393 


SH3 


SH3 domain 


3.5e-09 


43.9 


394 


AAA 


ATPases associated with various 
cellular act 


4.1e-21 


83.6 


396 


spectrin 


Spectrin repeat 


2.1e-67 


237.3 


397 


zf-C2H2


Zinc finger, C2H2 type 


0.0066 


23.1 


399 


fn3 


Fibronectin type III domain 


4.1e-102 


352.6 


400 


WD40 


WD domain, G-beta repeat 


0.00049 


26.8 


401 


E1_dehydrog


Dehydrogenase E1 component


3e-119 


409.6 


402 


fn3


Fibronectin type III domain 


0 


1719.6 


404 


LRR 


Leucine Rich Repeat 


2.1e-10 


48.0


405


cadherin


Cadherin domain 


8.1e-81 


281.9


406


zf-CXXC


CXXC zinc finger 


5e-15 


63.4 


410


RhoGEF 


RhoGEF domain 


1.1e-23


92.1 


411 


F-box 


F-box domain.


4.2e-06


33.7 


~3T5 




SNF2 and others N-terminal
domain 


5.0e-16 


61.6 




CPSase_L_cha
in


Carbamoyl-phosphate synthase
(CPSase) 


1.5e-172 


586.6 


AT Q 


LRR 


Leucine Rich Repeat 


3.8e-24


93.6 


419 


DENN 


DENN (AEX-3) domain


2e-58


207.5


420 


RasGEF 


RasGEF domain 


8.1e-43


155.7 


421 


ank 


Ank repeat 


1.4e-153 


523.7 


424 


G-patch 


G-patch domain 


le-19 


78.9


425 


pkinase 


Eukaryotic protein kinase 
domain 


2.2e-31 


117.1 


426 


Plexin_repea
t


Plexin repeat 


0.0023 


24.6 


427 


Plexin_repea 


Plexin repeat 


0.0023 


24.6




C 








429 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


8.6e-11


39.2 


431 


DEAD 


DEAD/DEAH box helicase


1e-66


214.6 


432 


SH3 


SH3 domain 


3.4e-16


57.2


433


GTP_CDC


Cell division protein


2.1e-114


393.5


436 


Collagen 


Collagen triple helix repeat 
(20 copies) 


4.6e-194 


658.1 


438 


Ricin_B_lect
in


Similarity to lectin domain of
ricin b


0.0085 


10. £ 


441 


Alpha_adapti
n_C


Alpha adaptin carboxyl-terminal
domai


1.2e-256


866.0 


442 


Alpha_adapti
n_C


Alpha adaptin carboxyl-terminal
domai


1.8e-235


795.7


443 


PDZ


PDZ domain (Also known as DHR
or GLGF).


1.9e-65




445 


LON 


ATP-dependent protease La (LON)
domain 


0.00012 


-17.1 


446 


ig 


Immunoglobulin domain 


0.00011 


20.1




sushi 


Sushi domain (SCR repeat) 


1.4e-18


75.2 


452 


fn3


Fibronectin type III domain


1.5e-06


35.2


454 


pyridoxal_de
C


Pyridoxal-dependent
decarboxylase conse 


O i JC"X* 


50.3


456 


kinesin 


Kinesin motor domain 


4.9e-217


11 A A — 


457 


neur_chan


Neurotransmitter-gated ion-
channel


1e-175


597.1


458 
468 


Josephin 
bZIP 


Josephin 

bZIP transcription factor 


0.0002 


16 7 


470 


NTP_transfer
ase


Nucleotidyl transferase 


1 • TS— 07 
6.3e-06


31.8 
"-2b\3 " 


471


WD40


WD domain, G-beta repeat 


2e-28 




107.9


473


LIM


LIM domain containing proteins


0.00021


20.7 


477 


zf-RanBP 


Zn-finger in Ran binding
protein and others. 


0.028 


21.0 


479 


WD40 


WD domain, G-beta repeat 


6.5e-18


73.0


480 


KRAB 


KRAB box 


1e-31


118.8 


481 


ArfGap 


Putative GTP-ase activating
protein for Arf


8.4e-66


232.0 


485 


SH2 


Src homology domain 2 


0.011 


11.4 




486 


C1q


C1q domain


A *)o.7A 
^* .JB" f*k 




487 


dsrm 


Double-stranded RNA binding
motif


1.1e-47


171.9 




489 


zf-C2H2


Zinc finger, C2H2 type


4.8e-153


DAI* 9 




490 


Alpha_adapti
n_C


Alpha adaptin carboxyl-terminal
domai


3.4e-222


TCI 




492 


SKI 


Shikimate kinase


1.2e-10


48.8




497 


ENV_polyprot
ein


env polyprotein (coat 
polyprotein) 


2.6e-22 


77.6 




498 


abhydrolase_
2


Phospholipase/Carboxylesterase 


0.041 


-48.1 




500 


rrm 


RNA recognition motif. 


5.4e-34


126.4




501 


WW 


WW domain 


4.6e-18


73.4 




502 


ig


Immunoglobulin domain


1.1e-10


39.5




504 


abhydrolase 


alpha/beta hydrolase fold


0.045 


-3.* 




505 


vwa 


von Willebrand factor type A 
domain 


7.1e-62


219.0




508 


Na_K_ATPase_
C


Na+/K+ ATPase C-terminus


2.3e-145






509 


Exonuclease 


Exonuclease 


1.3e-56 


201.5 




510 


Glycos_trans
f_1


Glycosyl transferases group 1


2.9e-06


27.0




511 


Glycos_trans
f_1


Glycosyl transferases group 1 


2.9e-06 


27.0




512


Glycos_trans
f_1


Glycosyl transferases group 1


1.9e-09


38.5




514 


pro_isomeras
e


Cyclophilin type peptidyl-
prolyl cis-tr


1.8e-63


221.4


515 




Cibf »iiice domain 


l.9e-18 


74 .7 


516




Surp module


4.3e-38


140.0 


523 


ig


Immunoglobulin domain


3.3e-06


25.0


526 


UBX 


UBX domain 


1.1e-34


128.6


528


adh_zinc


Zinc-binding dehydrogenases


2.7e-34 


127.4 


530 


SAM


SAM domain (Sterile alpha
motif) 


0.046 


10.0 


531 


adh_short


short chain dehydrogenase 


0.0025 


-34.1 


532 


mito_carr 


Mitochondrial carrier proteins 


a.Se-81- 


281.7 


533 


mito_carr


Mitochondrial carrier proteins 


2e-61 


213.5 


534 


thiolase 


Thiolase 


3.5e-183 


622.0 


535 


FMO-like 


Flavin-binding monooxygenase- 
like 


0 


1153.7 


536 


SCAN 


SCAN domain 


4e-55


196.6


537


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V)


3.1e-136


466.0 


538 


tRNA-synt_1


tRNA synthetases class I (I, L, 
M and V) 


3.1e-136 


466.0 


539 


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V) 


1.9e-117 


403.6 


540 


tRNA-synt_1


tRNA synthetases class I (I, L, 
M and V) 


3.1e-136 


466.0 


541 


vATP-synt_E 


ATP synthase (E/31 kDa) subunit 


5.9e-85 


295.7 


543 


zf-C2H2 


Zinc finger, C2H2 type 


5.5e-S§ 


242.4


544 


DUF101 


Protein of unknown function 
DUF101 


8.5e-38


139.0 


545 


TGFb_propept
ide


TGF-beta propeptide


1.1e-67


238.2 


547 


WD40 


WD domain, G-beta repeat 


2.6e-32


120.8 


548 


RHD 


Rel homology domain (RHD).


1.6e-238


686.2 


549 


MMR_HSR1 


GTPase of unknown function


5.4e-67


236.0 


551 


HECT 


HECT-domain (ubiquitin-
transferase).


4.3e-127 


435.6 


554 


MHC_II_alpha 


Class II histocompatibility 
antigen, alp 


3.5e-74 


259.8


555


zf-UBR1


Putative zinc finger in N- 
recognin 


3.3e-16 


67.3 


556 


Kelch 


Kelch motif 


5.5e-29 


109.7 


561 


AMP-binding


AMP-binding enzyme


2.8e-06


-163.7


562


PABP


Poly-adenylate binding protein,
unique domai 


4.9e-38 


139.8 


564


Gag_p30


Gag P30 core shell protein 


1.2e-67 


238 .2 


566 


PWWP 


PWWP domain 


8.1e-16


66.0


567


SCAN 


SCAN domain 


7.3e-68 


238.9 


569 


pkinase 


Eukaryotic protein kinase 
domain 


1.5e-84 


294.3 


570


pkinase


Eukaryotic protein kinase
domain 


1.5e-84 


294.3 


571 


CN_hydrolase


Carbon-nitrogen hydrolase


0.00081


-79.7


572


myosin_head


Myosin head (motor domain) 


0 


1495.2 


P /P 


myosin_head


Myosin head (motor domain) 


0 


1490.4 


P / P 


Surp 


Surp module 


1.7e-23 


91.5 


PrO 


Surp 


Surp module 


1.7e-23 


91.5


577


DNA_pol_B


DNA polymerase family B


0


1138.6


578


PDZ


PDZ domain (Also known as DHR
or GLGF).


8.3e-09


42.7


579 


LRR


Leucine Rich Repeat 


4.9e-21 


83.3 


580 


neur_chan 


Neurotransmitter-gated ion- 
channel 


P . PB - 1 / f 


D U J_ . P


583


sushi 


Sushi domain (SCR repeat) 


0 


1673.0 


584 


DEAD 


DEAD/DEAH box helicase


7.3e-36 


116.3 


586 


KH-domain


KH domain


2.9e-13 


57.5 


587 


G-patch 


G-patch domain 


2.3e-14 


61.2 


589 


LIM 


LIM domain containing proteins 


2.3e-36 


133.4 


590 


bromodomain


Bromodomain 


6.6e-32 


114.7 


591 


bromodomain 


Bromodomain 


6.6e-32 


114.7


592 


hormone_rec


Ligand-binding domain of
nuclear hormone


3.5e-22


87.1


593 


PHD


PHD-finger


3.8e-12


53.8


594 


cadherin 


Cadherin domain 


4.2e-99 


342.7 


596 


pkinase 


Eukaryotic protein kinase 
domain 


5e-92 


319.2




WD40 


WD domain, G-beta repeat 


0.00054


26.7


600 


FG-GAP


FG-GAP repeat


4.3e-75


262.9


602 


G_Aoap t_CT 


Gamma-adaptin, C-terminus


1.1e-53


191.8 


603 


pkinase 


Eukaryotic protein kinase 
domain


2.3e-86 


300.4 


605


Collagen 


Collagen triple helix repeat 
(20 copies) 


8e-42 


152.4 


606 


mito_carr


Mitochondrial carrier proteins


6.3e-67


232.3


608 


PWWP 


PWWP domain 


2.6e-28


107.5 


609 


PWWP 


PWWP domain 


2.6e-28


107.5 


613 


CAP_GLY


CAP-Gly domain 


0.0046 


20.1 


615 


RFX_DNA_bind
ing


RFX DNA-binding domain


5.2e-54 


192.9 


616 


kinesin 


Kinesin motor domain 


1.1e-81


284.8 


617 


kinesin 


Kinesin motor domain 


8.4e-80 


278.5


618 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


0.0098 


13.1 


620 


MATH 


MATH domain 


7.8e-05 


22.2 


621 


Y_phosphatas 
e 


Protein-tyrosine phosphatase


1.4e-32 


121.6 


622 


pkinase 


Eukaryotic protein kinase 
domain 


4.4e-40 


146.6 


623 


BNR 


BNR repeat 


2.1e-11


51.3 


624 


molybdopteri
n


Prokaryotic molybdopterin
oxidoreductas


i".4e-12 


42.2 


625 


TPR 


TPR Domain 


1.1e-17


72.2 


627 


cNMP_binding 


Cyclic nucleotide-binding 
domain 


3.7e-58


205.6


630 


adh_short 


short chain dehydrogenase 


5e-17 


70.0 


631 


zf-C2H2


Zinc finger, C2H2 type 


2.1e-88 


307.1 


632


rrm


RNA recognition motif.


H4e-05 


30.5 


635 


pkinase 


Eukaryotic protein kinase 
domain 


1.6e-104 


360.7 


636 


Fork_head 


Fork head domain


5.9e-27


163.0


637 


pkinase 


Eukaryotic protein kinase 
domain 


3.8e-70


246.5 


642 


TPR 


TPR Domain 


4.8e-06 


40.1 


643 


efhand


EF hand


1.9e-27


104.6 


647 


SNF2_N


SNF2 and others N-terminal 
domain 


1.2e-101 


351.1 


648 


PseudoU_synt
h_2


RNA pseudouridylate synthase 


1.9e-55 


197.6 


650 


zf-C2H2


Zinc finger, C2H2 type 


0.0087 


22.7 


651 


ank 


Ank repeat 


1.3e-17


71.9 


652 


I_LWEQ


I/LWEQ domain 


9.5e-101 


341.0 


653 


neur_chan 


Neurotransmitter-gated ion- 
channel 


4.1e-171


581.8 


t»0*k 


tsp_1


Thrombospondin type 1 domain


4.1e-47


169.9


f CQ 


FH2 


Formin Homology 2 Domain


1e-107


371.2


661 


pou 


Pou domain - N-terminal to 
homeobox domain 


5.3e-45


162.9


662 


C2 


C2 domain 


6.7e-19


76.2 


663 


C2 


C2 domain 


6.7e-19


76.2


664 


C2 


C2 domain 


6.7e-19 


76.2


667


GST


Glutathione S-transferases.


9.3e-34


114.4


668 


LRR 


Leucine Rich Repeat 


9.3e-31


115.6 


670 


spectrin 


Spectrin repeat 


4e-57 


203.2 


671 


I_LWEQ 


I/LWEQ domain 


9.5e-101 


341.0 


672 


ABC_tran


ABC transporter 


5.3e-60 


212.8 


674 


WD40 


WD domain, G-beta repeat 


4.8e-24 


93.3


675 


WD40 


WD domain, G-beta repeat 


4.8e-24


93.3


676 


LRR 


Leucine Rich Repeat


0.0015




679 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H
type 


4. . oe - £. y 


107.7


680


zf-C2H2


Zinc finger, C2H2 type 




inn " 


681




Calponin homology (CH) domain




71.1


682 


DSPc


Dual specificity phosphatase,
catalytic doma


4.3e-43


156.6 


683 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)




10.8


687 


Synapsin


Synapsin 


0 


1 a art a 


689 


PR55 


Protein phosphatase 2A
regulatory subunit PR


0


1038.8


691 


homeobox 


Homeobox domain


0 . ae-jy 


112.4


696 


Peptidase_M2 
4 


metallopeptidase family M24 


9 Cp.CQ 
& . OC"3? 


£10.5 


697 


RhoGEF 


RhoGEF domain


J . 3B-J D 


128.9


698 


PHD 


PHD-finger


ft nnn 


9.3 


701 


zf-C2H2 


Zinc finger, C2H2 type 


5.5e-123 


422.0 


702 


Sulfatase


Sulfatase


3e-231


781.6


703 


zf-C2H2


Zinc finger, C2H2 type 


O ♦ ye-20 


79.8


707 


Acyl_transf


Acyl transferase domain


1.1e-22


88.8


708 


WD40




4.8e-19


76.7


710 


Ran_BP1


RanBP1 domain.


8.4e-06


-7.3 


713 


DEAD 


DEAD/DEAH box helicase


9.9e-42


134.9


714 


PH 


PH domain 


1.6e-09 


39.0 


715 


DSPc


Dual specificity phosphatase,
catalytic doma


1.5e-37


138.2 


717 


Sialyl trans £ 




7.5e-31


115.9 


718 


ig




1e-29


100.8


719 


integrin_B




0 


1125.4 


720 


zf-C3HC4 


Zinc finger, C3HC4 type (RING
finger) 


1 .le-oa 


32.4 


722 




Calpain family cysteine
protease


3e-145 


495.9 


723




Immunoglobulin domain 


2 .2e-05 


22.4 


724 


F-box 


F-box domain.


0.007


23.0


725 


Nop 


Putative snoRNA binding domain 


8.1e-58


205.5 


726 


Nop 


Putative snoRNA binding domain


8.1e-58


205.5 


727


WD40 




7.5e-26


99.3 


730 


dsrm 


Double-stranded RNA binding
motif


0.027


12.1


731 


dynamin 


Dynamin family


4.2e-16


66.9


733 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H
type


2.8e-10


41.7 


735 


CDP-
OH_P_transf


phosphatidyl transferase


4.2e-26


100.1 


738 


DEAD 


DEAD/DEAH box helicase 


8.6e-57


182.5 


739 


TSC22 


TSC-22/dip/bun family 


0 . ae-i^i 


119.5


742 


ras 


Ras family 


2.2e-100 


346.9 


743 


PMI_typeI


Phosphomannose isomerase type I




B22 . 9 


747 


trypsin 


Trypsin 


6.4e-88 


279.4 


748 


kazal 


Kazal-type serine protease
inhibitor domain


Z . 2e-52 


187.4 


749 


efhand


EF hand


6.3e-05


Tt — 1 


751 


PHD 


PHD-finger


* . 7C*ib 


66.7


752


zf-C2H2


Zinc finger, C2H2 type


■j - ze-^i 




753 


Hydrolase 


haloacid dehalogenase-like 
hydrolase 


6.1e-ll 


49.8 


754


Ribosomal_L3
9


Ribosomal L39 protein


0.00018


26.7 


755


PH


PH domain


3.6e-14


55.7 


758


SCAN


SCAN domain


1.4e-53 


191.5


759


PA


PA domain


0.0065


23.1


760


arf


ADP-ribosylation factor family 


2.2e-19 


77.8 


761 


CIDE-N 


CIDE-N domain


2.2e-40 


147.6


1 1>2 


histone


Core histone H2A/H2B/H3/H4


9.9e-53 


188.6 


/D J 


zf-MYND


MYND finger 


4.1e-14 


60.3 




pou 


Pou domain - N-terminal to
homeobox domain


1e-52


188.6 


767


vwc 


von Willebrand factor type C 
domain 


2.9e-34 


127.3 


769 


efhand


EF hand 


4.8e-11


50.1 


770 


zf-C4 


Zinc finger, C4 type (two 
domains)


2.4e-53 


181.6 






Ras family


7e-90 


312 .0 


/ / A 


Sulfatase


Sulfatase


1e-142


487.5


/ / & 


zf-C2H2


Zinc finger, C2H2 type


1.1e-12


55.5


776


zf-C2H2


Zinc finger, C2H2 type


1.1e-12


55.5


777


zf-C2H2


Zinc finger, C2H2 type 


1.1e-12


55.5 


778 


rrm 


RNA recognition motif. 


2.1e-32 


121.1 


779 


G6PD 


Glucose-6-phosphate
dehydrogenase 


1.5e-76 


236.6 


780 


spectrin 


Spectrin repeat 


3.7e-29 


110.3 


781 


mito_carr


Mitochondrial carrier proteins 


4.6e-57 


198.5 


782 


SCAN 


SCAN domain 


1.3e-24


95.2 


783 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


4.1e-07 


37.1 


785 


DEAD 


DEAD/DEAH box helicase


6e-06


21. V 


786


ras 


Ras family 


5.3e-39 


143.0 


787 


RNase_HII


Ribonuclease HII


2.5e-67 


237.1 


790 


PI3_PI4_kina 
se 


Phosphatidylinositol 3- and 4-
kinases


5.4e-108 


372.2 


795 


cadherin 


Cadherin domain 


2.5e-40 


147.4 


796


ARID 


ARID DNA binding domain 


1.6e-20 


81.6 


797 


trypsin 


Trypsin 


9.9e-20 


64.8 


799 


CH 


Calponin homology (CH) domain 


3.7e-15 


63 . 8 


801 


Gal-
bind_lectin


Vertebrate galactoside-binding
lectin


4.1e-25


88.7


803 


WD40 


WD domain, G-beta repeat 


0.00082


26.1 


806 


TBC 


TBC domain 


1.8e-26 


101.4 


807 


TBC


TBC domain 


1.8e-26 


101.4 


808 


CN_hydrolase


Carbon-nitrogen hydrolase


8.8e-80 


"2^8. <* 


811 


CBFD_NFYB_HMF


Histone-like transcription
factor 


6e-14 


59.8 


812 


adh_short


short chain dehydrogenase


8.1e-20


79.3


814 


IMP4 


Domain of unknown function 


3.3e-71 


250.0 


815 


zf-C2H2 


Zinc finger, C2H2 type 


8.2e-66


232.1 


816 


Pept_tRNA_hydro


Peptidyl-tRNA hydrolase


l.tie-37 


138.0 


817 


ARID 


ARID DNA binding domain 


2.5e-18


74.3 


826 


IF5_eIF4_eIF 
2 


eIF4-gamma/eIF5/eIF2-epsilon


1.6e-32


121.5


830 


ArfGap


Putative GTP-ase activating
protein for Arf


1.5e-53


191.3


831 


LRR 


Leucine Rich Repeat 


2.1e-26 


101.1 


832 


laminin_EGF


Laminin EGF-like (Domains III
and V)


2e-57 


204.2 


839 


rrm 


RNA recognition motif. 


1.3e-22 


88.5


840 


Y_phosphatase


Protein-tyrosine phosphatase


2.6e-119


409.8


841 


pkinase


Eukaryotic protein kinase 
domain 


3.4e-100 


346.3 


844 


Ribosomal_L22e


Ribosomal L22e protein family


1e-64


228.4


846 


IBR 


IBR domain 


9e-15 


62.5 


849 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


7.4e-07 


26.5 


850 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


0.00016 


18.9 


851 


SET


SET domain 


5e-30 


113.2 


852 


SRCR 


Scavenger receptor cysteine-
rich domain


0


1025.4






853




Scavenger receptor cysteine- 
rich domain 


0 


1025.4 


857


lactamase_B


Metallo-beta-lactamase
superfamily


0.012


-6.0 


858


COX6A


Cytochrome c oxidase subunit VIa


3.4e-58


206.7


859






5.4e-45


162.9


861 


PRK 


Phosphoribulokinase


5.1e-62


219.4 


863 


mito_carr


Mitochondrial carrier proteins


2.9e-53


185.5 


864 


HSP90 


Hsp90 protein 


4.7e-158


538.5 


866 




Immunoglobulin domain 


4e-12


44.1 


867




Zinc finger, C2H2 type 


7e-135 


461.5 


872 


histone


Core histone H2A/H2B/H3/H4


4.9e-41


149.8 


a ha 


CPSase_L_chain


Carbamoyl-phosphate synthase
(CPSase) 


2.1e-218 


739.0 


a /u 


Ribosomal_S12e


Ribosomal protein S12e


2.1e-98


340.3 


ooZ 




Serpins (serine protease 
inhibitors) 


2.5e-« 


145.7 


OOj 


Patatin


Patatin 


1.2e-51 


182.0 




RA 


Ras association (RalGDS/AF-6)
domain


0.044


8.0 


OD / 


DUF92 


Integral membrane protein DUF92 


2.7e-12 


54.3 


889


sugar_tr 


Sugar (and other) transporter 


8.2e-63 


222.1 


OS?.* 


DUF28


Domain of unknown function 
DUF28 


1.3e-43 


158.3 




IP_trans 


Phosphatidylinositol transfer 
protein 


£.5e-9B 


338.7 


896 


DEAD 


DEAD/DEAH box helicase


1.5e-48


l5o.5 


899 




KE2 family protein 


7e-61 


215.7 


900 




KE2 family protein 


4.3e-51 


183.2 


901 


zf-C2H2 


Zinc finger, C2H2 type 


2.7e-57 


203.8 


902 


ras 


Ras family 


2.3e-75 


263.8 


904 




TPR Domain


3.2e-22 


87.2 


906 


GBP


Guanylate-binding protein


8.9e-253


853.1 




GBP


Guanylate-binding protein


1.1e-239


809.6 


908 


WD40 


WD domain, G-beta repeat 


2.6e-26 


100.8 


ana 


PH 


PH domain


1.3e-09 


39.4 


qi n 


zf-C2H2


Zinc finger, C2H2 type


2.5e-39


144.1


-'I J 


Epimerase 


NAD dependent
epimerase/dehydratase family


5e-07 


-88.5 


921 


TBC 


TBC domain 


1.5e-09


30.7 


922 


WD40


WD domain, G-beta repeat 


1.6e-25 


98.2


923


WD40 


WD domain, G-beta repeat 


8.2e-07 


36.1 




Hydrolase 


haloacid dehalogenase-like 
hydrolase 


2.9e-05 


29.1 


925


UQ_con


Ubiquitin-conjugating enzyme


0.00033


-27.6 


926 


CH


Calponin homology (CH) domain


3.3e-53


190.2 


928 


WD40 


WD domain, G-beta repeat 


5.9e-48 


172.7


929




Zinc finger, C3HC4 type (RING 
finger) 


3.1e-10


37.4 


930 


Ribul_P_3_epim


Ribulose-phosphate 3 epimerase
family


7.2e-105


361.8 


931 


Ribul_P_3_epim


Ribulose-phosphate 3 epimerase
family


1.2e-96 


334.4 


936 


C2 


C2 domain 


2.2e-62 


220.7 


937 


NAP_family


Nucleosome assembly protein 
(NAP) 




a a a 

oft . o 


940 


abhydrolase 


alpha/beta hydrolase fold 


0.011 


3.1


944 


Tropomyosin 


Tropomyosins 


3.2e-07 


25.1 


948 


pkinase 


Eukaryotic protein kinase 
domain 


3.4e-75 


263.2 


949 


WD40 


WD domain, G-beta repeat 


1.8e-27 


104.7 


950 


Acyltransferase


Acyltransferase


1.6e-07 


38.4 
SEQ ID
NO:


PFAM NAME


DESCRIPTION


p-value


PFAM
SCORE


951 


SAM 


SAM domain (Sterile alpha
motif)




14.5


954


GFO_IDH_MocA


Oxidoreductase family 


1 . Je ll 


OA . 0 


955 


BTB 


BTB/POZ domain


7e-22


86.1


956 


BTB 


BTB/POZ domain 


7e-22 


86.1 


957 


CDP-OH_P_transf


CDP-alcohol
phosphatidyltransferase


" h ne-i 


-22.2


959 


ras 


Ras family 


c . ^te- ? / 


17C a 

Jjb . o 


960 


ras 


Ras family 


8.4e-43 


155.5 


961 


Acetyltransf 


Acetyltransferase (GNAT) family


i . ze-u o 


42.2


962 


adh_short


short chain dehydrogenase


2.4e-31


117.6 


963


mutT


Bacterial mutT protein


5.6e-06


26.2


969


IF-2B


Initiation factor 2 subunit
family


8.4e-193


653.9


970 


RNase_PH


3' exoribonuclease family


9e-24 


92.4


975


WW


WW domain


5.7e-25 


96.4 


977 


PDZ 


PDZ domain (Also known as DHR
or GLGF).


3.6e-21


83.7


978 


Ribosomal_L17


Ribosomal protein L17 


2.4e-20 


81.0


979


LIM


LIM domain containing proteins 


5.8e-42 


152.8 


980 




Calsequestrin


1.7e-297 


1001.7 


982 


HSP20 


Hsp20/alpha crystallin family 


1.2e-10 


43.2 


983 


UAlUUicU Qb 


NADH ubiquinone oxidoreductase, 


4.8e-63


222.9


988 


TBC 


TBC domain 


2.2e-50 


180.8 


989 


TBC 


TBC domain 


2.2e-50 


180.0 


993 


tRNA_int_end 


tRNA intron endonuclease 


0.0017 


-34.2 


994 






Homeobox domain 


4e-18 


73.6 


997 


pyr_redox


Pyridine nucleotide-disulphide
oxidoreducta


0.012 


11.6 


1000 




Mitochondrial carrier proteins 


9.7e-123 


421.2 


1001 


RA


Ras association (RalGDS/AF-6)
domain


1.2e-15 


£5.4 


"1004 


DUF81


Domain of unknown function
DUF81


0.099


10.2


1005


actin


Actin


1.3e-174


574.3 


1006 




Actin


3.1e-130


428.6 


1007 




TCP-1/cpn60 chaperonin family


3.7e-195


661.8 


1008 


TPR


TPR Domain


8.1e-44


159.0 


1009 


zf-C2H2


Zinc finger, C2H2 type


3.6e-61


216.6 


1011 




Zinc finger, C2H2 type


3.6e-61


216.6 


1012


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


4\7e-lS 


53.1 


1016 


tRNA-synt_2c


tRNA synthetases class II (A) 


2.3e-15 


55.2 


1018 


RhoGAP 


RhoGAP domain


1.6e-78


274.3 


1022 


PGAM


Phosphoglycerate mutase family


3.8e-18


69.7 


1026 


HMG box 


HMG (high mobility group) box 


8.4e-20 


79.2 


1027 


TBC 


TBC domain


7.3e-45


1^2.5 1 


1028 


UQ_con


Ubiquitin-conjugating enzyme


1.4e-49 


178.1 


1032 


PDZ


PDZ domain (Also known as DHR
or GLGF).


0.028


16.3


1034


Hydrolase


haloacid dehalogenase-like
hydrolase


2e-21


84.6


1037 


KRAB


KRAB box 


4.8e-06 


32.4 


1038 


Cation_efflux


Cation efflux family


7.1e-42


152.5


1040 


ART


NAD:arginine ADP-
ribosyltransferase


4.7e-47 


169.1 


1042 


WD40 


WD domain, G-beta repeat 


1.9e-18


74.7 


1043 


zf-C2H2


Zinc finger, C2H2 type 


3.7e-24 


93.7


1045 


lectin_c


Lectin C-type domain 


1.9e-28 


108.0 


1046 


Glucosamine_iso


Glucosamine-6-phosphate
isomerase 


0.00013 


-25.1 
SEQ ID
NO:


PFAM NAME


DESCRIPTION


p-value


PFAM
SCORE


1047 






4.5e-80


279.4


1043 


ig 




1.7e-09


35.6


1050 


Ribosomal_L24e


Ribosomal protein L24e 


2e-33 


124.5 


1054 


Amidase


Amidase


4.3e-152


518.7


1055


rrm 




3.8e-26


100.3


1058


annexin 




6.9e-44


159.2


1059


PMP22_Claudin




0.023


-23.6


1060 


homeobox


Homeobox domain


3.2e-31


117.2


1062 
1064


Acyltransferase




0.00065


10.5




AMP-binding


AMP-binding enzyme


| 6 . fie- 100 


345.3


1065


LRR


Leucine Rich Repeat


3.3e-14


60.6 


1066


GTP1_OBG






141.8


1071




Immunoglobulin domain 


8.4e-48


159.1 


1072


PHD


PHD-finger


6.8e-07


36.3 


1074


DENN


DENN (AEX-3) domain


8.3e-33


121.5 


1075


SCP


SCP-like extracellular protein


4.7e-41


149.8 


1077 
1078


OLF


Olfactomedin-like domain


2.2e-66


234.0 




mito_carr


Mitochondrial carrier proteins


1e-42


149.3 


1079


WD40


WD domain, G-beta repeat


6.2e-45


162.7 


1007 


START 


START domain 


1.5e-48


174.7 


1093
1094


DSPc


Dual specificity phosphatase,
catalytic doma


3.3e-63 


223.4 


i 1 noc 


GSHPx 


Glutathione peroxidases 


9.6e-41


146.8 




DUF25 


Domain of unknown function 
DUF25


2e-75


264.0


1096 
1105


DUF25


Domain of unknown function
DUF25


6e-75 


262.4 




Nitroreductase


Nitroreductase family 


1.3e-13 


58.6


1106


PTE


Phosphodiesterase family


1.3e-179 


610.1 


1107 
1109




Diacylglycerol kinase catalytic
domain


0.00049


19.6 


1115


ras


Ras family


1.3e-15 


40.7 


1116


ArfGap


Putative GTP-ase activating 
protein for Arf 


9.7e-47 


168.7 




HMG14_17


HMG14 and HMG17


4.4e-21


83.5 




HMG14_17


HMG14 and HMG17


9.9e-12


52.4 


1119 


FAA_hydrolase


Fumarylacetoacetate (FAA)
hydrolase fam


2e-83


290.6 


1120 




Eukaryotic protein kinase 
domain 


1.4e-94 


327.6


1123 




alpha/beta hydrolase fold


9.2e-23 


89.0 


1129 


pro_isomerase


Cyclophilin type peptidyl- 


2.2e-56 


197.1 


1131


DnaJ


DnaJ domain


1.6e-30 


114.9 


1132


WD40


WD domain, G-beta repeat


1.3e-19 


78.6 


1133
1134


WD40


WD domain, G-beta repeat


1.8e-15


64.9 




PH 


PH domain 


0.0015 


17.8 


1136 


Adap_comp_sub


Adaptor complexes medium
subunit family


1.2e-255


866.0


1137 


Adap_comp_sub


Adaptor complexes medium
subunit family


2.5e-209


708.8


1139




KdH i diaxxy 


1.5e-86 


301.0 


1141 




Eukaryotic protein kinase
domain


9.4e-74 


258.4 


1152 


Acyltransferase


Acyltransferase


1.2e-05 


29.9 


1153 


IRS 


PTB domain (IRS-1 type)


5.4e-55 


196.1 


1155


ig


Immunoglobulin domain


1.3e-31 


106.9 


1157 


Asparaginase_2


Asparaginase 


6.4e-72 


252.3 


1159 


GMC_oxred 


GMC oxidoreductases


4.7e-142 


485.3 


1160 


zf-AN1


AN1-like Zinc finger


0.00021 


27.9 
SEQ ID
NO:


PFAM NAME


DESCRIPTION


p-value


PFAM 
SCORE 


1163 


linker_histone


linker histone H1 and H5 family


3.8e-14


60.4


1164


DED


Death effector domain


3.9e-05


30.5


1165 


IRS 




2.6e-43


157.3 


1166 


IRS 


PTB domain (IRS-1 type)


2.6e-43 


157.3 


1168


SAM


SAM domain (Sterile alpha


0.04


10.5 


1170 


abhydrolase


alpha/beta hydrolase fold


0.098


-7.5


1174 


SAP 


SAP domain


3.9e-10


47.1


1177 


PP2C 




5.3e-31


112.5


1178 


WD40 


WD domain, G-beta repeat 


4.7e-35 


129.9 


1180


Ets


Ets-domain


1.8e-09


33.3 


1181 




Collagen triple helix repeat
(20 copies)


0.00016 


24.7 


1182 




TCL1/MTCP1 family


9.5e-56 


198.6 


1184 


RasGEF


RasGEF domain 


1.7e-88 


307.4


1 1 Or 

lied 


mito_carr


Mitochondrial carrier proteins 


1.5e-62 


217.3 


1187 


UPAR_LY6


u-PAR/Ly-6 domain 


0.0042 


15.6 


1188 


Orn_DAP_Arg_deC


Pyridoxal-dependent
decarboxylase 


6.2e-12B 


""430. & 


1193 


Stathmin 


Stathmin family 


1.8e-90


314.0 


1194 


Stathmin 


Stathmin family 


1.8e-90 


314.0 


1195 


Sec1


Sec1 family


3.2e-183 


622.1 


1196 


pyr_redox 


Pyridine nucleotide-disulphide
oxidoreducta 


3.1e-32 


111.8 


1197 


Glyco_transf_8


Glycosyl transferase family 8 


1.2e-09 


45.5 


1202


K_tetra 


K+ channel tetramerisation 
domain 


0.022 


-16.8 


1203 


adh_short


short chain dehydrogenase 


8.3e-45 


162.3 


1206 


Ubie_methyltran


ubiE/COQ5 methyltransferase
family 


1.3e-121 


417.4 


1208 


7tm_3


7 transmembrane receptor


7.2e-09 


29.0 


1209


ank


Ank repeat


3.9e-15


63.7 


1210 


vATP-synt_AC39


ATP synthase (C/AC39) subunit 


2.5e-128 


439.7 


1212 




Zinc finger, C2H2 type 


5.5e-17 


69.9 


1213 


efhand


EF hand


3.2e-07


37.4


1219 


rrm 


RNA recognition motif.


2.1e-40


147.7 


l£ <£ U 


DUF6


Integral membrane protein DUF6 


o/oiS 


21.5


1222 




SCAN domain 


1.5e-71


251.1 


1223 


G-gamma


GGL domain 


3.6e-36 


129.5 


1227 


catalase 


Catalase 


0 


1158.9 


1232 


PX 


px domain 


2.2e-15


64.5


1233 


PX 


PX domain 


2.2e-15 


64.5




FCH 


Fes/CIP4 homology domain 


3.3e-09 


44.0 




Peptidase_M20


Peptidase family M20/M25/M40 


2e-63 


224.1 


1243 


WW 


WW domain 


0.044 


17.9


1247


UPF0006


Metalloenzyme of unknown
function UPF0006


6.3e-61 


215.8 


1248 


Glycos_transf_2


Glycosyl transferases 


4.5e-10 


46.9 


1249 


efhand


EF hand


4e-11


50.4 


1254 


UQ_con 


Ubiquitin-conjugating enzyme


2.1e-73


257.3 


1255 


ras


Ras family 


2.2e-62 


220.7 


1256 


formyl_transf


Formyl transferase


4.9e-30 


108.3 


1259 


zf-C3HC4 


Zinc finger, C3HC4 type (RING
finger) 


5.3e-13 


46.4 


1261


DiHfolate_red


Dihydrofolate reductase 


2.1e-69 


241.7 


1262 


G_glu_transpept


Gamma-glutamyl transpeptidase


1.8e-110


380.4 


1263 


PAS 


PAS domain 


1.3e-08 


36.9 


1265 


LRR 


Leucine Rich Repeat 


4.2e-22 


86.9 
SEQ ID

NO: 


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


1266 


SCP 


SCP-like extracellular protein 


6e-29 


108.0 


1267 


K_tetra 


K+ channel tetramerisation
domain 


2.8e-27 


104.0 


1269 


ras 


Ras family 


1.3e-85 


297.9 


1275 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


4.2e-10 


37.0 


1276 


abhydrolase 


alpha/beta hydrolase fold 


5.4e-23 


89.8 


1277 


abhydrolase 


alpha/beta hydrolase fold


5.6e-21 


83.1 


1279 


trypsin 


Trypsin 


4.4e-41 


132.0 


1280 


PBP 


Phosphatidylethanolamine-
binding protein 


1.3e-13 


58.7 


1285 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


5.6e-14 


49.6 


1287 


ank 


Ank repeat


1.7e-52


187.8


1294 


fn3 


Fibronectin type III domain 


0.026 


20.9 


1295 


GBP


Guanylate-binding protein


0.00026 


-70.0 


1296 


PMP22_Claudi 
n 


PMP-22/EMP/MP20/claudin family 


6.9e-41 


149.3 


1297 


Rhodanese 


Rhodanese-like domain


3.2e-14 


60.7 


1298 


LIM 


LIM domain containing proteins 


5.8e-21 


79.1 


1301 


rnaseA 


Pancreatic ribonucleases 


4.9e-43


145.2 


1307 


mito_carr


Mitochondrial carrier proteins 


2.1e-53 


186.0 


1308 


WD40 


WD domain, G-beta repeat 


1.6e-17 




1316 


UPAR LY6 


u-PAR/Ly-6 domain 


7.1e-20 


75.5 


1313 


thiored 


Thioredoxin 


3.6e-05 


21.6 


1314 


Aa_trans 


Transmembrane amino acid 
transporter protein 


1.5e-67 


237.9 


1316 


trypsin 


Trypsin 


4.4e-41 


132.0 


1320 


Ribosomal_L13


Ribosomal protein L13 


3.9e-62 


219.8 


1327 


Armadillo_seg


Armadillo/beta-catenin-like 
repeats 


0.0054 


23.4 


1328 


KRAB 


KRAB box


0.052 


-5.6 


1329 


rrm


RNA recognition motif.


2.1e-40


147.7


1330 


Bcl-2 


Apoptosis regulator proteins, 
Bcl-2 family 


0.014


-1.6


1331 


PX 


PX domain 


2.1e-10 


48.0 


1333 


KRAB 


KRAB box 


1.8e-36


134.6


1334 


UPP_synthetase


Putative undecaprenyl 
diphosphate synt 


2.3e-89 


310.3 


1335 


UPP_synthetase


Putative undecaprenyl 
diphosphate synt 


1.8e-59 


211.0 


1336 


DSPc


Dual specificity phosphatase, 
catalytic doma 


1.2e-31 


118.6 


1337


DSPc 


Dual specificity phosphatase, 
catalytic doma 


2.3e-12 


54.5 


1338 


TPR 


TPR Domain 


0.00021 


28.1 


1340 


metalthio


Metallothionein


0.013 


20.3 


1341 


mutT 


Bacterial mutT protein 


5.8e-09


36.5


1343 


Band_41


FERM domain (Band 4.1 family)


1.3e-38 


122.5 


1344 


Kelch 


Kelch motif 


1.4e-44 


161.5 


1345 


Antifreeze 


Antifreeze protein 


1.2e-10


48.8 


1347 


3Beta_HSD 


3-beta hydroxysteroid
dehydrogenase/isomera


0.086 


-177.2 


1348 


BTB


BTB/POZ domain 


5.3e-28 


106.5 


1349 


DUF6 


Integral membrane protein DUF6 


0.033 


15.8 


1350 


myosin_head 


Myosin head (motor domain)


0


1088.7 




Nramp 


Natural resistance-associated 
macrophage pro 


1.2e-202 


686.6 


1353 


S_100 


S-100/ICaBP type calcium 
binding domain 


5.3e-23


89.9 


1355 


DEAD 


DEAD/DEAH box helicase 


3.6e-65


209.0 


1356 


C2 


C2 domain 


2.4e-15 


64.4 


1357 


RBD 


Raf-like Ras-binding domain


4.2e-57


203.1


1360 


zf-C2H2


Zinc finger, C2H2 type 


7.4e-141 


481.4 


1361 


HMG14 17 


HMG14 and HMG17 


7.9e-40


145.7 
SEQ ID
NO:


PFAM NAME


DESCRIPTION


p-value


PFAM
SCORE


1362 


SIS 






113.6


1363 


SIS 


SIS domain 


1.3e-28


108.5


1364 


ig 






19.0


1368 


K_tetra 


K+ channel tetramerisation 
domain 


1.1e-16


DO .7 


1371 


Collagen 


Collagen triple helix repeat 
(20 copies) 


2.2e-113


J3U . 1 


1372 


DnaJ


DnaJ domain 


6.6e-36 


132.7 


1376 


KRAB 


KRAB box 


2 . 


141.0


1378 


ELM2 


ELM2 domain


2e-23




1380


thiored 


Thioredoxin 


1.2e-23


82.8


1381 


ank 


Ank repeat 


2.3e-83


290.4


1382 


BTB 


BTB/POZ domain




50 . £ 


1383 


WD40 




1 . DC" ±if 


78.3


1384 


WD40 


WD domain, G-beta repeat 




92.9


1387 


zf-C3HC4 


Zinc finger, C3HC4 type (RING
finger)


1.1e-09


35.6 


1389 


zf-C2H2


Zinc finger, C2H2 type




179.5


1390 


zf-C2H2


Zinc finger, C2H2 type


a . DC-DO 


296.9


1393 


kinesin


Kinesin motor domain


7.8e-188 


637.4 


1394 


zf-C2H2 


Zinc finger, C2H2 type


1.2e-49


178.4


1398 


KRAB 




5.1e-22


86.6


1402 


bZIP 


bZIP transcription factor 


0.035 


13.1 


1405




Sugar (and other) transporter


0.003


-101.5 


1406 


RhoGAP 


RhoGAP domain


8.9e-47


168.8 


1407 




RNA recognition motif.


1e-35


132.1 


1408


LRR


Leucine Rich Repeat


2.1e-13


58.0


1409 


at 


jNeoujLan repeat 


6e-54 


192.6 


1410 


ank 




1.6e-17 


71. £ • 


1412 


Ribosomal_L5_C


ribosomal L5P family C-terminus


8.2e-58


205.5 


1415 


trypsin 


Trypsin 


4 .7e-85 


270.4


1416 


aminotran_1


Aminotransferases class-I


4.4e-05


-91.2


1417 


S1


S1 RNA binding domain


1.6e-07 


33.1 


1419 


WD40 


WD domain, G-beta repeat 


2.2e-09


44.6


1422 


cadherin 


Cadherin domain 


8.3e-42


152.3 


1424 


SH3 


SH3 domain


2.5e-80


280.3 


1425 


PHD 




3.2e-17


70.6 


1426 


PHD 


PHD-finger


3.2e-17


70.6


1427 


ArfGap


Putative GTP-ase activating
protein for Arf


1e-37


138.8


1428 




Helicases conserved C-terminal
domain


1e-26


102.2 


1429 


WD40 




3.9e-07


37.2 


1430 


Inositol_P


Inositol monophosphatase family


2.5e-10


40.2


1431 


mito_carr


Mitochondrial carrier proteins


4.3e-83


287.7


1433 


C1q




2.9e-16


66.2


1434 


WD40 


WD domain, G-beta repeat 


J. . DC'IJ 


58.3


1435 


Inos-1-P_synth


Myo-inositol-1-phosphate
synthase


7e-228


770.4


1436


rrm 


RNA recognition motif. 


1.4e-34


128.3


1438 


ig


Immunoglobulin domain


1.3e-12


*to - o 


1440 


G_Adapt_CT


Gamma-adaptin, C-terminus


3.4e-67


236.7 


1441 


G_Adapt_CT 


Gamma-adaptin, C-terminus


3.4e-67


236.7


1443 


Kelch 


Kelch motif 


0.00013


28.7


1446 


ARID 


ARID DNA binding domain 


1.8e-21


84.7


1447 


zf-C2H2 


Zinc finger, C2H2 type 


9.4e-28 


105.6 


1448 


AMP-binding


AMP-binding enzyme 


2.6e-07 


-145.1 


1451 


rrm 


RNA recognition motif. 


6.5e-21 


82.9 


1454 


ig


Immunoglobulin domain 


5.6e-44 


146.7 


1455 


Sialyl trans f 


Sialyltransferase family


5.4e-21 


83.2 


1460 


Aldose_epim


Aldose 1-epimerase


1.9e-35 


131.2 


1461 


C2 


C2 domain 


4e-18 


73.6 


1470 


TIG 


IPT/TIG domain


3.1e-19 


77.3 


1472 


PseudoU_synt 


SNA pseudouridylate synthase 


4.3e-16 


66.9 
SEQ ID
NO:


PFAM NAME


DESCRIPTION


p-value


PFAM 
SCORE 












1474 


DENN


DENN (AEX-3) domain


1.3e-44


161.6 


1475 


Cation_efflux


Cation efflux family 


4.6e-49 


176.4 


1477 


TBC 


TBC domain 


8e-47 


169.0 


1478 


mn 




2e-21 


84.6


1480 


ig 


Immunoglobulin domain 


5.5e-06 


24.3 


1484 


Telo bind al 
pha 


Telomere-binding protein alpha
subuni


0.028 


-225.9 


1485


zf-C2H2


Zinc finger, C2H2 type


1.8e-68


240.9


1486


pkinase 


Eukaryotic protein kinase
domain 


9.5e-13 


49.9 


1488 




Helicases conserved C-terminal
domain 


1.4e-15 


65.2 


1489 




Protein of unknown function 
DUF89 


0.079 


-132.4 


1490


ECH


Enoyl-CoA hydratase/isomerase
family


5.2e-41


149.7


1491 


guanylate_cy 


Adenylate and Guanylate cyclase 
catalyt 


5.9e-46 


166.1 


1492 


LRR 


Leucine Rich Repeat 


3.4e-19


77.2 


149* 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


7.1e-10 


36.3 


1497 
1500 


pkinase
SH3 


Eukaryotic protein kinase 

domain 

SH3 domain 


le-22 


85.8 


"1502 
1503 


homeobox 


Homeobox domain 
Homeobox domain 


9 .3e-05 

0.084

0.084


27 .2 
13.8 
13.8 


1505 
1506 


EGF
UCH-2


EGF-like domain

Ubiquitin carboxyl-terminal

hydrolase family


2.7e-23 
2.7e-21 


90.8 
84.2


1508 


Peptidase_M20


Peptidase family M20/M25/M40


2.8e-28


101.8 


1511 
1512 


PX 


PX domain 
Sulfatase


1.9e-11


51.5 


151* 


Syntaxin 


Syntaxin 


2.8e-35
0.011


130.7
-62.3 


1518
1520


aminotran_3

ig 


Aminotransferases class-III
pyridoxal-pho
Immunoglobulin domain 


9.7e-106 


305.6 


1521 

1523
1528 


RA 

RhoGAP 
WD40 


Ras association (RalGDS/AF-6)
domain 

RhoGAP domain 

WD domain, G-beta repeat 


0.075 
0 .615 

2.Se-05 
5.4e-24 


11.0 
13.3 

10.7 
93.1 


1535 
1538 
1539 

1540 


IMS 
FYVE 

Ocular_alb 


impB/mucB/samB family 

FYVE zinc finger

Diacylglycerol kinase catalytic

domain 

Ocular albinism type 1 protein 


7.8e-95

3.2e-27 

6e-07 

0 


328.5 
101.5 
36.5 

1184.7 


1653 
1654 


SAP 

Amino_oxidase


SAP domain 

Flavin containing amine oxidase 


6e-06 
3.2e-43 


33.2 
157.0


1655 
1656 


Amino_oxidase

RhoGEF 


Flavin containing amine oxidase 


3.2e-43 
1.4e-24 


157.0 
95.1


1657 
1659

1660 


MMR_HSR1

UCH-2

actin 


GTPase of unknown function
Ubiquitin carboxyl-terminal
hydrolase family 
Actin 


0.0011 
2.5e-11

6.6e-21


-45.5 
51.1 

69.9


1661 
1662 


BAH 

vwa


BAH domain 

von Willebrand factor type A 
domain 


1.7e-82 
0 


287.5
1909.4 


1663 


WD40 


WD domain, G-beta repeat 


1.4e-67 


237.9


1667


zf-C2H2


Zinc finger, C2H2 type 


1.3e-93 


324.4


1669 
1671 


Nol1_Nop2_Sun

SH2


NOL1/NOP2/sun family
Src homology domain 2 


1.3e-23 
5.4e-15 


84.3
46.9 
SEQ ID
NO: 


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


1672 




cm oiwo luuuuuiaCin 
Oraanizatlon MOrii fi pri 


2.1e-18 


67.7


1674 


zf-CCCH 


tvne 


0.0025


17.6


1676 


Glyco_hydro_47






1.8e-187


636.2


1677 


Glyco_hydro_47


Glycosyl hydrolase family 47


*» • ae- /4 


259.5


1680


WD40


WD domain, G-beta repeat


1.1e-27


105.5


1681


WD40


WD domain, G-beta repeat


1.1e-27


105.5 


1683 


MMR_HSR1


GTPase of unknown function


1.8e-78


274.1


1691 




RNA recognition motif.


1.8e-37


137.9


1692 




RNA recognition motif.


1.8e-37


137.9


1693 


AAA 


ATPases associated with various 


1.3e-81


284.5


1697 


Ferric_reduct


Ferric reductase like
transmembrane com


8.4e-82


285.2


1698 


Ferric_reduct


Ferric reductase like
transmembrane com


3.5e-53


190.1


1699 


zf-C2H2


Zinc finger, C2H2 type


4.4e-34


126.6


1700 


arf


ADP-ribosylation factor family


9e-19


75.8


1702 


GTP_EFTU 


Elongation factor Tu family 


0.014 


11.4 


1703 


SCAN 


SCAN domain


1.8e-54


194.4


1707


pkinase


Eukaryotic protein kinase
domain


1.2e-88 


307.9 


1709 




WD domain, G-beta repeat 


0.0035 


24.0 


1710 




Leucine Rich Repeat


1.2e-30 


115.3 


1711 





WW domain 


7.6e-12 


52.8


1712 


ank 


Ank repeat 


4.2e-34 


126.7 


1713 




Zinc finger C-x8-C-x5-C-x3-H 
type 


2.6e-09 


38.3


1714 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H
type


2.6e-09


38.3


1715




Ras family 


4.4e-41


149.9 


1716 


HMG_box


HMG (high mobility group) box


8.3e-21


82.6


1719


TBC 


TBC domain


1.1e-45


l£5.2 


1721 


HLH


Helix-loop-helix DNA-binding
domain 


9.2e-10 


45.9 


1723 


dsrm


Double-stranded RNA binding
motif 


2.9e-05 


30.9 


1724 


RrnaAD 


Ribosomal RNA adenine
dimethylases


0.045 


9.2 


1725 


CIDE-N


CIDE-N domain


5.9e-40 


146.2 


1726 


HAT 


HAT (Half-A-TPR) repeats


2.9e-44 


160.5 


1728 


efhand


EF hand


5.1e-20


79.9 


1733 


1 


Histone deacetylase family 


1.7e-104


360.6


1735 


LRR


Leucine Rich Repeat


4.6e-34


126.6


1739 


PI-PLC-X


Phosphatidylinositol-specific
phospholipase


0.0023


16.1


1743 


ras 


Ras family 


■3 * /e-xu 


-21.3 


1744 


ras 


Ras family 


1 To .m 
j * /e-iu 


-21.3


1745


RasGEF


RasGEF domain 




176.9


1746


adh_short


short chain dehydrogenase


7.1e-08


34.6


1751 


zf-C2H2


Zinc finger, C2H2 type


9e-39


142.2


1754


fn3


Fibronectin type III domain 


-> • 36-11/1 


"iAa a 
jfio . y 


175 £ 


zf-C2H2


Zinc finger, C2H2 type 


£ 7 a_ Q*3 


322.1


1758


rrm


RNA recognition motif. 


0.017 


21.2 


1760 


Nop 


Putative snoRNA binding domain 


6.1e-95


328.8


1761


Nop 


Putative snoRNA binding domain 


6.1e-95 


328.8 


1765


MMR_HSR1


GTPase of unknown function 


6.4e-41 


149.4 


1769 


CN_hydrolase 


Carbon-nitrogen hydrolase


3e-06 


-43.9 


1775 


ank 


Ank repeat 


4.1e-07 


37.1 


1779


Oxysterol_BP


Oxysterol-binding protein


4.7e-56 


199.6 


1783 


RhoGEF


RhoGEF domain


1.6e-23 


91.6 


RhoGEF


RhoGEF domain 


1.6e-23 


91.6 
SEQ ID
NO:


PFAM NAME


DESCRIPTION


p-value


PFAM 
SCORE 


1785 


rrm


RNA recognition motif. 


6.4e-14 


59.7 



WO 01/53312 



PCT/US00/34263 



TABLE 5 



SEQ ID NO: 


POSITION OF 
SIGNAL IN AMINO 
ACID SEQUENCE 


MaxS (MAXIMUM 
SCORE) 


MeanS (MEAN
SCORE)


1 


1-21 


0.991 


0.955 


2 


1-31 


0.995 


0.944


3 


1-33 


0.949 


0.736


4


1-19


0.970


0.951


5


1-26 


0.971 


0.863


6


1-26


0.971


0.863


7 


1-26 


0.971 


0.863


8 


1-26 


0.971


0.863


9


1-46


0.982


0.901


10 


1-21 


0.991


0.955


11 


1-23 


0.989 


0.899


12 


1-25 


0.955 


0.803


13 


1-18 


0.932


ft OC 


14 


i-is 


0.938


ft R7C 


15 


1-25 


0.941


0.611


16 


1-17 


0.972


0.939


17 


1-27 


0.964 


0.777


18 


1-16 


0 91 d 


0.657


19 


1-19 


0.953


0.840 


20 


1-20 


0.935


0.701


21 


1-22 


0.974


0.850


22 


1-33 


0.961


0.895


23 


1-19 




0.959


24 


1-31 




0.944


25 


1-22 


0.976


0.935


26 


1-27 


n QQC 
U . 33t) 


0.928


27 


1-24 


0.953


0.739


28


1-21


0.906


0.688


29 


1-31 


0.986 


0.841


30 


1-28 


0.980


0.893


31 


1-19 


v • 33J 


0.976


32 


1-22 


0.998


0.909


35 


1-33 


0.949


0.736


36 


1-33 


0.949


0.736


46 


1-19 


0.970


0.951


67 


1-25 


0.968


0.848 


71 


1-18 


0.949


0.845


72


1-30


0.991


0.919


75 


1-29 


ft QCft 
V.399 


0.854


88 


1-20 


0.986 


0.945


94 


1-33 


0.994


0.943


97


1-46


0.964 


ft CQC 


103 


1-49 


0.983


0.570 


108 


1-26 


0.978


0.885


111 


1-23 


0.989 


0.899


126 


1-25 


0.955 


0.803


129


1-19


0.963


0.918


138 


1-29 


0.971 


0.844


143 


1-18 


0.914


0.628


148


1-20


0.969


0.904 


156


1-25


0.941


0.811


158 


1-22 


0.979 


0.927


160 


1-17 


0.972 


0.939 


161 


1-48 


0.903 


0.571 


162 


1-25


0.937


0.729


168


1-16


0.939


0.826 


171 


1-27 


0.964 


0.777 


178


1-21


0.945


0.825


180 


1-27 


0.981 


0.941 


187 


1-28 


0.982


0.936 


190 


1-19 


0.953 


0.840 


196 


1-22 


0.975 


0.916 


197 


1-22 


0.963 


0.936 
SEQ ID NO:


POSITION OF
SIGNAL IN AMINO
ACID SEQUENCE


MaxS (MAXIMUM
SCORE)


MeanS (MEAN
SCORE)


199


1-20 


0.935 


0.701 


200 


1-23 


0.977 


0.773 


206 


1-30 


0.984 


0.890 


207 


1-19 


0.990 


0.924 


208 


1-22 


0.974 


0.850


210


1-40


0.940


0.670 


211 


1-28 


0.971 


0.849


216 


1-24 


0.986 


0.956 


218


1-33


0.961


0.895


219 


1-19 


0.970


0.871


221 


1-19 


0.904 


0.553 


222


1-21


0.917


0.555 


230 


1-19 


0.991


0.959


231 


1-26 


0.953 


0.800 


232 


1-25 


0.988 


0 . sirs 


239 


1-23


0.969


U . O £. O 


240




0.982


fl a c d " " 


241 


1-17 


0 . 982 


0 . 955 


245 


1-30 


0.970 


0.722 


248 


1-22 


O 97fi 
yj • zt i o 


0 . 935 


249 


1-23 


0.968 


0.940


252


1-18


0.^71 


0.923


261 


1-24 


n do-) 

U • 0 D J 


0 .587 


265


1-18 


n Q"> q 


0.868 


272 


1-24 


n qci 


0 . 739 


283


1-21 




0.686


284 


1-29 


U - 3if f 


0 . 854 


290 


1-31 




0 . 841 


302 


1-28 


n on a 


0 . 893 


304 


1-16 


0 . 907 


0 . 635 


312 


1-19 


n o o i 


0 .976 


313


1-17 


0 . 930 


0 .753 


323 


1-22 


0.998 


0 . 909 


324 


1-17 


0 . 982 


n qca — 


328 


1-19 




0.865


329 


1-22 


U . Jro J 


0 . 924 


330 


1-33 


W • 3 SO 


0 . 841 


331 


1-24 


fl cin 

u . at u 


0 . 712 


332 


1-24


0 . g~75 


0.881 


333 


1-19 


0.984 


0 . 941 


334 


1-20 


n nqq 


0 . 567 


335 


1-27 


U • 94£ 


0 . 813 


336 


1-20 




u . HbO 


337 


1-38 


0 . 942 


0.653


338


1-27 


0 . 973 


U . { 1 £. 


339 


1-36 


0 . 979 


V . ou% 


340 


1-27 


0.888


0 . 597 


343 


1-19 


0 . 971 


0.865


344 


1-22 


0 . 994 




345 


1-17 


0 . 966 


0 . 687 


346 


1-19 


0.936 


0.822 


347 


1-22 


0.963 


0 . 924 


349 


1-24 


0 . 932 


0 . 966 


351 


1-21 


0 . 918 


0.815


352 


1-31 


0 . 988 


0 . 912 


354 


1-31 


0.974


0.839


355 


1-29 


0.932 


0.632 


356 


1-15 


0.994 


0.369 


357 


1-33 


0.935 


0.726 


360 


1-27 


0.938 


0.821 


361 


1-25 


0.954 


0.674 


362 


1-22 


0.929 


0.788 


363 


1-21 


0.881


0.715 


364 


1-33 


0.978 


0.841 


365 


1-33 


0.978 


0.641 






WO 01/53312 



PCT/US00/34263 





366 


1-21 


0.916


0.820 


367 


1-19 


0.936 


0 .822 


368 


1-29 


0.972 


0 .874 


370 


1-24 


0 .920 


0 . 712 


371 


1-24 


0.961 


0.773 


372 


1-27 


0.919 


0 .768 


373 


1-19 


0 .986 


0 . 945 


375 


1-32 


0 .994 


0.932 


376 


1-34 


0 .987 


0 .810 


377


1-17


0 .995 


0 .950 


378 


1-49 


0 . 971 


0.749 


380 


1-20 


0.968 


0.874


381 


1-20 


0.928


0.782 


382 


1-19


0.986


u . ?J1 


383 


1-28 


0.965 


ft flOQ 


384 


1-39 


0 . 970 


ft C^l 
U .331 


386 


1-24




0 . 881 


388 


1-30 




0 . 868 


389


1-19 


ft oaA 


0 . 941 


390 


1-26


ft Q71 


0.782 


392


1-20 


ft Q D 1 


0.900 


393 


1-16


U . 3D O 


0.890 


394 


1-23


O Ql T 
U . / 


0 . 701 


397 


1-22 


ft Q Q C 


0.854


399 


1-46


0.977


0 . 698 


401 


1-20 


0 . 899 


0 .567 


402 


1-22


u . SJo7 


0 . 931 


403 


2-27


0.992 


0 .934 


404 


1-19


ft QQ1 


0 . 973 


405 


1-23 


0 . 994 


0 .921 


407 


1-35 




0 . 658 


408 


1-39


0 . 976 


0 . 551 


409 


1-33


0 . 897 


0.570 


410 


1-25 


0 . 990 


0 .962 


411 


1-38 


0 . 977 


0.827


412 


1-20


0 . 944 


0 .768 


413 


1-20 


0 . 988 


0.965 


414 




0 . 993 


0.638 


415 


1-23


0.981 


0 . 940 


417







0.672 


418 


1-20 


0 . 952 


0.850 


419 




ft OQC 


0.967 


420 


1-29 


ft QCC 


0.861 


421 


1-22 


ft b'flo 


0 . 785 


422 


1-48 


ft QQO 


0 . 862 


424 


1-19 


ft Q 7 4 


0 . 933 


426 


1-38 


0.942 


n K<?i 

U . DJJ 


430 


1-18 


0 . 947 


0 .595 


432 


1-33 


0.957 


0 .789 


433 


1-26 


0 . 979 


0.904


434 


1-27 


0 . 962 


0 . 777 


435 


1-24 


0.998


0 . 977 


43<* 


1-27 


0 . 973 


0 . 772 


443 


1-15 


0 . 966 


0 . 940 


448 


1-36 


0 . 979 


0.804


453 


1-41 


0 . 958 


0 . 609 


455 


1-33 


0.943 


0 .606 


457 


1-27 


0.888


0.597 


462 


1-16 


0.925 


0.681 


486 


1-27 


0.972 


0.845 


495


2-24 


0.917 


0.636 


496 


1-26 


0.993 


0.890 


505 


1-20 


0.976 


0.926 


507 


1-17


0.966 


0.687 


510 


1-23 


0.930 


0.593


511 


1-23 


0.930 


0.593 


512 


1-23 


0.930 


0.593 


515


1-18 


0.978 


0.956 


523


1-19 


0.936


0.822 


529 


1-22 


0.963 


0.924 


545


1-24 


0.9B2 


0.966


550 


1-30 


0.933 


0.713 


552 


1-21 


0.973 


0.912 


554 


1-23 


0.969 


0.784 


571 


1-21 


0.918


0.814 


574 


1-31


0.988 


0.912 


580


1-39 


0.S25 


0.556 


594 


1-31 


0.974 


0.839 


608 


1-29 


0.932 


0.632 


609 


1-29 


0.932 


0.632 


610 


1-21 


0.990 


0.948 


621 


1-15 


0.994 


0.969 


623 


1-33 


0.935 


0.726 


<*S3 


1-27 


0.938 


0.827 


~S68" 


1-22


0.929 


0.788 


~*77 . 


1-16 


0.948 


0.807 


685


1-21 


0.881 


0.715 


"6^9 


1-22 


0.97$ 


0.816 


702 


1-31 


0.968 


0.898 


707 


1-16 


0.880 


0.562


713 


1-25 


0.966 


0.743 


718 


1-19 


0.936 


0.822 


"71*J 


1-20 


0.961 


0.824 


729 


1-29 


0.972 


0.874 


735 


1-46 


0.903 


0.598


746


1-14 


0.916 


0.730


747


1-22


0.965 


0.876 


748 


1-29 


0.968


0 .785 


759 


1-24 


0.961 


0.773 


767 


1-27 


0.919 


0 .768 


768 


1-33 


0.900 


0.585 


773 


1-42 


0.959 


0.702 


779 


1-19 


0.986 


0.945 


797 


1-19 


0.944 


0.759 


798 


1-19 


0.900


0.568 


820 


1-17


0.995


0.950 


827 


1-49 


0.971 


0.749 


848 


1-20 


0.968 


0.874 


864 


1-20 


0.928


0.782 


866


1-19 


0.986 


0.934


873


1-23 


0.948 


0.686 


881


1-28 


0.965 


0.829 


887 


1-39 


0.970 


0.551 


927


1-30


0.989 


0.868 


934


1-48


0.988 


0.777 


939 


1-39 


0.994 


0.889 


944 


1-26 


0.971 


0.782 


950 


1-29 


0.9*7 


0.845 


963 


1-20 


0.981 


0.900 


964 


1-20 


0.886


0.558 


973 


1-16 


0.968 


0.890 


980 


1-34 


0.961 


0.749 


981 


1-20 


0.953 


0.822


984 


1-12 


0.938 


0.780 


1015 


1-22 


0.985 


0.854 


1040 


1-46 


0.977


0.698 


1052 


1-18 


0.969 


0.842 


1059


1-20 


0.927 


0.867 


1065 


1-33 


0.983 


0.918 


1069 


1-22 


0.993 


0.934


1075 


1-27 


0.992 


0.934


1080 


1-19 


0.931 


0.829


1092 


1-19 


0.991 


0.973


1094 


1-46 


0.992 


0.653


1095 


1-30 


0.974 


0.929


1105 


1-23 


0.994 


0.921


1123 


1-35 


0.987 


0.658


1138 


1-32 


0.954 


0.613 


1140 


1-38 


0.989


0.789


1142 


1-33 


0.897 


0.570


1152 


1-25 


0.990


0.962


1170 


1-38


0.977 


0.827 


1176 


1-20 


0.944


0.768


1187 


1-20 


0.988 


0.965


1189 


1-35


0.967 


0.839


1192 


1-46 


0.993 


0.638 


1193 


1-16 


0.925 


0.710


1197 


1-29 


0.985 


0.853


1208 


1-23 


0.981 


0.940


1225 


1-29 


0.941 


0.672


1245 


1-19 


0.986


0.967


1258 


1-29 


0.965 


0.861


1265 


1-22 


0.889


0.785


1266 


1-20 


0.944 


0.809


1276 


1-48 


0.982 


0.862


1292 


1-19 


0.979 


0.933


1296


1-21 


0.984 


0.944


1297 


1-19


0.984 


0.953


1332 


1-38 


0.942 


0.653


1358 


1-18 


0.947 


0.595


1371 


1-33 


0.957 


0.789


1380 


1-26


0.979 


0.904


1397 


1-27 


0.962 


0.777


1399 


1-23 


0.997 


0.960


1404 


1-24 


0.998


0.977


1410 


1-15 


0.946


0.845


1414 


1-24 


0.913 


0.588


1415 


1-19 


0.982


0.929 


1416


1-12


0.931 


0.891 


1418 


1-30 


0.933 


0.563


1420 


1-20 


0.881


0.561


1421 


1-19 


0.990


0.968


1423 


1-17 


0.968 


0.863


1424 


1-21 


0.885


0.591


1425 


1-24 


0.913 


0.588


1426 


1-24 


0.913 


0.588


1428 


1-25 


0.967 


0.899


1430 


1-34 


0.977 


0.819


1431 


1-28 


0.979 


0.923


1432 


1-36 


0.957 


0.613


1433 


1-32 


0.921 


0.753


1434 


1-39 


0.983 


0.621


1435 


1-25 


0.910


0.631


1436 


1-42 


0.988 


0.868


1437 


1-22 


0.998 


0.980


1442 


1-20 


0.918 


0.753


1448 


1-12 


0.931 


0.891


1462 


1-18 


0.968 


0.888


1490 


1-20 


0.881 


0.561


1518 


1-17 


0.968 


0.863 


1525 


1-21 


0.885


0.591 


1547 


1-28 


0.974 


0.891


1561 


1-25 


0.967 


0.899


1580


1-17 


0.923


0.824


1593


1-28


0.979 


0.923


1596


1-16


0.929 


0.709 


1601


1-36


0.957 


0.613 


1606 


1-22 


0.979 


0.831 


1607 


1-20 


0.974 


0.770


1608 


1-32 


0.921 


0.753 


1614


1-33


0.969


0.829 


1616 


1-20 


0.959 


0.869 


1625 


1-39 


0.983 


0.621 


1632 


1-25 


0.910 


0.631 


1636 


"i-ii 


0.^97 


0.591 


"1*33 


1-42 


0.988 


0.868 


1645


1-20


0.927 


0.568 


1647 


1-17 


0.923 


0.742 


1646 


1-22


0.998


0.980



TABLE 6



SEQ ID NO:
of full-length
nucleotide
sequence


SEQ ID NO:
of full-length
peptide
sequence


SEQ ID NO:
of contig
nucleotide
sequence


SEQ ID NO:
of contig
peptide
sequence


Priority docket
number_corresponding
SEQ ID NO: in
priority
application


SEQ ID NO: in
U.S.S.N.
09/488,725


1 


1787


3573 


5359 


784CIP2 1 


1103 


2 


1788 


3574 


5360 


784CIP2 2 


2673 


3 


1789 


3575


5361 


784CIP2_3


4ir> 


4 


1790 


3576 


5362 


784CIP2 4 


5556 


5 


1791 


3577 


5363 


784CIP2 5 


5562 


6 


1792 


3578 


5364 


784CIP2 6 


5562 


7 


1793 


3579


5365 


784CIP2 7 


5562 


8


1794 


3580


5366 


784CIP2 8 


5562 


9 


1795 


3581 


5367 


784CIP2_9 


5563 


10 


1796 


3582 


5368 


784CIP2 10 


5564 


11


1797 


3583 


5369 


784CIP2 11 


" 55^5 


12 


1798 


3584 


5370 


784CIP2 12 


5689 


13 


1799 


3585 


5371 


784CIP2 13 


5729 


14 


1800 


3586 


5372 


784CIP2_14 


5745 


15 


1801 


3587 


5373 


784CIP2 15 


5777 


16


1802 


3588 


5374 


784CIP2 16 


5777


17 


1803 


3589 


5375 


784CIP2 17 


5789 


18


1804 


3590 


5376 


784CIP2 18 


5792 


19 


1805 


3591 


5377 


784CIP2_19 


5804 


20 


1806


3592 


5378 


784CIP2 20 


5805 


21 


1807 


3593 


5379 


784CIP2 21 


5805 


22 


1808 


3594


5380 


784CIP2 22 


5844 


23 


1809 


3595 


5381 


784CIP2 23 


5844 


24 


1810 


3596 


5382 


784CIP2 24 


5850 


25 


1811 


3597 


5383 


784CIP2_25


5867


26 


1812 


3598


5384 


784CIP2_26 


5973 


27 


1813 


3599 


5385 


784CIP2_27 


5995 


28 


1814 


3600 


5386 


784CIP2 28 


5995 


29 


1815 


3601


5387 


784CIP2_29


6005 


30 


1816 


3602


5388


784CIP2 30 


6007 


31 


1817


3603 


5389 


784CIP2_31


6007 


32 


1818 


3604 


5390 


784CIP2_32


6009 


33 


1819 


3605 


5391


784CIP2_33


6012


34 


1820


3606


5392 


784CIP2_34


6015 


35 


1821 


3607


5393 


784CIP2_35


6016 


36 


1822 


3608 


5394 


784CIP2 36 


6016 


37 


1823


3609 


5395 


784CIP2_37


6018 


38 


1824 


3610 


5396 


784CIP2 38 


6018


39 


1825 


3611 


5397 


784CIP2_39


6018 


40 


1826 


3612 


5398 


784CIP2_40


6023 


41


1827 


3613 


5399 


784CIP2_41 


6070 


42 


1828 


3614 


5400 


784CIP2_42


6081 


43 


1829 


3615 


5401 


784CIP2_43


6089 


44 


1830 


3616 


5402 


784CIP2_44 


6118 


45 


1831 


3617 


5403 


784CIP2_45 


6118 


46 


1832


3618 


5404 


784CIP2_46


6130 


47


1833 


3619 


5405 


784CIP2 47 


6177 


48 


1834 


3620 


5406 


784CIP2 48 


6189 


49 


1835 


3621


5407


784CIP2_49


6191 


50 


1836


3622


5408 


784CIP2 50 


6204 


51 


1837 


3623 


5409 


784CIP2_51


6204 


52 


1838 


3624


5410 


784CIP2_52


6284 


53 


1839 


3625


5411 


784CIP2 53 


6367 


54 


1840 


3626 


5412 


784CIP2_54 


6436 


55 


1841 


3627 


5413 


784CIP2_55 


6442 


56 


1842 


3628 


5414 


784CIP2_56


6445 


57 


1843 


3629 


5415 


784CIP2_57 


6457 


58 


1844 


3630 


5416 


784CIP2_58


6458 


59


1845


3631 


5417 


784CIP2 59 


6458





60 


1846


3632 


5418 


784CIP2 60 


6462 


61 


1847 


3633 


5419 


784CIP2 61 


6472 


62 


1848 


3634 


5420 


784CIP2_62


6499 


63 


1849 


3635 


5421 


784CIP2 63 


6499 


64 


1850 


3636 


5422 


784CIP2 64 


6505 


65 


1851 


3637 


5423 


784CIP2_65 


6534 


66 


1852 


3638 


5424 


784CIP2 66 


6534 


67 


1853 


3639


5425


784CIP2_67


6540 


68 


1854 


3640 


5426


784CIP2 68 


6550 


69 


1855


3641 


5427 


784CIP2 69 


6550


70 


1856 


3642 


5428 


784CIP2 70 


6592


71 


1857


3643 


5429 


784CIP2_71


6645 


72 


1858


3644 


5430 


784CIP2 72 


6671 


73 


1859 


3645 


5431 


784CIP2_73


6763 


74 


1860 


3646 


5432 


784CIP2 74 


6763 


75 


1861


3647 


5433 


784CIP2_75


6786 


76 


1862 


3648 


5434 


784CIP2 76 


6824 


77 


1863 


3649 


5435 


784CIP2 77 


6830 


78 


1864 


3650 


5436 


784CIP2_78


6831 


79 


1865 


3651 


5437


784CIP2_79


6832


80


1866 


3652 


5438 


784CIP2 80 


6834 


81 


1867 


3653 


5439 


784CIP2 81 


6834 


82 


1868


3654 


5440 


784CIP2 82 


6835 


83 


1869 


3655 


5441 


784CIP2 83 




84 


1870


3656


5442


784CIP2_84


6843 


85 


1871 


3657


5443 


784CIP2 85 


6859


86 


1872 


3658 


5444 


784CIP2 86 


6915 


87 


1873 


3659 


5445 


784CIP2_87 


6932 


88 


1874 


3660 


5446


784CIP2_88


6957 


89 


1875 


3661 


5447 


784CIP2_89 


6961 


90 


1876 


3662 


5448


784CIP2_90


6973 


91 


1877 


3663 


5449 


784CIP2_91


6973 


92 


1878 


3664 


5450


784CIP2_93


7007 


93 


1879 


3665 


5451 


784CIP2_94


7018 


94 


1880 


3666 


5452 


784CIP2_95


7019 


95 


1881 


3667 


5453 


784CIP2 96 


7020 


96 


1882 


3668 


5454 


784CIP2_97


702b 


97 


1883 


3669


5455 


784CIP2_98


7021 


98 


1884 


3670 


5456 


784CIP2 99 


7023 


99 


1885 


3671 


5457 


784CIP2_100 


7027 


100


1886


3672 


5458 


784CIP2_101


7028 


101


1887 


3673 


5459 


784CIP2 102 


7029 


102 


1888 


3674 


5460 


784CIP2 103 


7031 


103


1889 


3675 


5461 


784CIP2_104


7032 


104 


1890 


3676 


5462 


784CIP2_105


7033 


105 


1891 


3677 


5463


784CIP2_106


703k 


106 


1892


3678 


5464 


784CIP2_107 


7036 


107 


1893 


3679 


5465 


784CIP2_108


7039 


108 


1894 


3680 


5466 


784CIP2 109 


7043 


109 


1895 


3681 


5467 


784CIP2 110 


7044 


110 


1896


3682 


5468


784CIP2 111 


7046 


111 


1897 


3683


5469 


784CIP2 112 


7054 


112 


1898 


3684 


5470 


784CIP2 113 


7061 


113 


1899


3685


5471 


784CIP2_114


loll 


114 


1900 


3686


5472 


784CIP2_115


7092 


115 


1901 


3687 


5473 


784CIP2_116


7094 


116 


1902 


3688


5474 


784CIP2_117 


7106 


117 


1903 


3689 


5475 


784CIP2 118 


7107 


118 


1904 


3690 


5476


784CIP2 119 


7111 


119 


1905 


3691 


5477


784CIP2 120 


7123 


120 


1906 


3692 


5478 


784CIP2 121 


7142 


121 


1907 


3693 


5479 


784CIP2 122 


7142


122 


1908 


3694 


5480 


784CIP2 123 


7154 


123 


1909 


3695 


5481 


784CIP2 124 


7160 


124 


1910 


3696 


5482 


784CIP2_125 


7169 


125 


1911 


3697 


5483 


784CIP2 126 


7185 


126 


1912 


3698 


5484 


784CIP2_127 


7197 


127 


1913 


3699 


5485 


784CIP2_128


7219 


128 


1914 


3700


5486


784CIP2 129 


7226 


129


1915 


3701 


5487 


784CIP2_130 


7229 


130 


1916 


3702 


5488 


784CIP2_131


7234 


131 


1917


3703 


5489 


784CIP2_132 


7235 


132 


1918 


3704 


5490 


784CIP2_133 


7235 


133 


1919 


3705 


5491 


784CIP2_134


7238 


134 


1920 


3706


5492


784CIP2_135 


7247 


135


1921 


3707 


5493 


784CIP2_136


7261 


136 


1922 


3708 


5494 


784CIP2 137 


7262 


137


1923 


3709 


5495 


784CIP2_138 


7267 


138


1924 


3710 


5496 


784CIP2 139 


7272 


139 


1925 


3711 


5497 


784CIP2 140 


7273 


140


1926 


3712 


5498


784CIP2 141 


7282 


141 


1927 


3713 


5499 


784CIP2 142 


7288


142 


1928 


3714 


5500


784CIP2_143


7291


143


1929 


3715 


5501 


784CIP2_144


7293 


144 


1930 


3716 


5502


784CIP2_145 


7294 


145 


1931 


3717 


5503 


784CIP2_146 


7299 


146 


1932 


3718 


5504 


784CIP2_147


7300 


147 


1933 


3719 


5505 


784CIP2 148 


7312 


148


1934 


3720 


5506 


784CIP2_149 


7313


149 


1935 


3721 


5507 


784CIP2 150 


7315 


150 


1936 


3722 


5508


784CIP2_151 


7318 


151 


1937 


3723 


5509 


784CIP2_152 


7321 


152 


1938


3724 


5510 


784CIP2_153 


7330 


153 


1939 


3725


5511 


784CIP2 154 


7331 


154 


1940 


3726 


5512 


784CIP2 155 


7333 




155


1941 


3727 


5513 


784CIP2_156


7350 




156 


1942 


3728 


5514 


784CIP2 157 


7352 




157 


1943 


3729 


5515 


784CIP2 158 


7384 




158 


1944 


3730 


5516 


784CIP2 159 


7403 




159 


1945 


3731 


5517 


784CIP2_160


7431 




160 


1946


3732 


5518 


784CIP2 161 


7441 




161 


1947


3733 


5519 


784CIP2 162 


7453 




162 


1948 


3734


5520


784CIP2_163


7467




163 


1949 


3735 


5521 


784CIP2_164


7471 




164 


1950 


3736


5522


784CIP2 165 


7493 




165 


1951 


3737 


5523 


784CIP2_166


7502 




166 


1952 


3738


5524 


784CIP2 167 


7511 




167 


1953 


3739 


5525 


784CIP2 168 


7514 




168 


1954 


3740 


5526 


784CIP2 169 


7520




169 


1955 


3741 


5527 


784CIP2 170 


7541




170 


1956 


3742 


5528 


784CIP2 171 


7570




171 


1957 


3743 


5529 


784CIP2_172 


7578 




172 


1958 


3744 


5530


784CIP2 173 


7583




173 


1959 


3745 


5531 


784CIP2 174 


7592 


174 


1960 


3746 


5532 


784CIP2_175 


7601 




175 


1961 


3747 


5533 


784CIP2 176 


7602 




176 


1962 


3748 


5534 


784CIP2 177 


7608 




177 


1963 


3749 


5535 


784CIP2 178 


7615 




178 


1964 


3750 


5536 


784CIP2 179 


7617 




179 


1965 


3751 


5537


784CIP2_181


7624 




180 


1966 


3752


5538


784CIP2 182 


7626 




181 


1967 


3753 


5539 


784CIP2 183 


7640 




182 


1968


3754 


5540 


784CIP2_184


7641 


183


1969 


3755 


5541 


784CIP2 185 


7641


184 


1970 


3756 


5542 


784CIP2 186 


7641 


185 


1971 


3757 


5543 


784CIP2 187 


7642 


186 


1972 


3758


5544 


784CIP2 188 


7649 


187 


1973 


3759 


5545 


784CIP2_189


7656 


188 


1974 


3760 


5546 


784CIP2_190 


7657 


189 


1975 


3761 


5547 


784CIP2 191 


7657 


190 


1976 


3762 


5548 


784CIP2 192 


7662 


191 


1977 


3763 


5549 


784CIP2 193 


7668 


192 


1978 


3764 


5550 


784CIP2 194 


7673 


193 


1979 


3765 


5551 


784CIP2 195 


7690 


194 


1980


3766 


5552 


784CIP2_196


7700


195 


1981 


3767 


5553 


784CIP2_197


7709


196 


1982 


3768 


5554 


784CIP2 198 


7736 


197 


1983 


3769 


5555 


784CIP2_199 


7737 


198 


1984 


3770 


5556 


784CIP2_200 


7744 


199 


1985 


3771


5557 


784CIP2 201 


7771 


200 


1986 


3772 


5558 


784CIP2_202


7786 


201 


1987 


3773


5559 


784CIP2_203 


7791 


202 


1988 


3774 


5560 


784CIP2_204 


7797 


203 


1989 


3775 


5561 


784CIP2 205 


7806 


204 


1990 


3776 


5562


784CIP2 206 


7812 


205 


1991 


3777 


5563 


784CIP2 207 


7812 


206 


1992 


3778 


5564 


784CIP2_208 


7818


207 


1993 


3779 


5565 


784CIP2 209 


7822


208 


1994 


3780 


5566 


784CIP2_210


7827 


209 


1995


3781 


5567 


784CIP2 211 


7830 


210 


1996


3782 


5568 


784CIP2_212


7835 


211 


1997 


3783 


5569 


784CIP2_214


7840 


212 


1998


3784 


5570 


784CIP2_215


7858 


213 


1999


3785 


5571 






214 


2000 


3786 


5572 




7861 


215 


2001 


3787 


5573 




/ODD 


216 


2002 


3788 


5574 


784CIP2_219


/□DO 


217 


2003 


3789 


5575 


784CIP2_220


7QQC 


218 


2004 


3790 


5576 


784CIP2_221


1 R Q Q 


219 


2005 


3791 


5577 


784CIP2_222


7900 


220 


2006 


3792 


5578 


784CIP2_223 


7906 


221 


2007 


3793 


5579 


784CIP2_224 


7908


222 


2008


3794 


5580 


784CIP2 225 


7909 


223 


2009 


3795 


5581 


784CIP2_226 


7917 


224 


2010 


3796 


5582 


784CIP2_227 


7932 


225 


2011


3797


5583


784CIP2_228


7940


226 


2012 


3798 


5584 


784CIP2_229 


7940 


227 


2013 


3799 


5585


784CIP2 230 


7984 


228 


2014 


3800


5586


784CIP2 231 


7984 


229 


2015 


3801 


5587 


784CIP2_232


8001 


230 


2016


3802 


5588 


784CIP2 233 


8021 


231 


2017 


3803 


5589 


784CIP2 234 


8029 


232 


2018 


3804 


5590 


784CIP2 235 


8033 


233 


2019 


3805 


5591 


784CIP2_236


8040 


234 


2020 


3806


5592 


784CIP2 237 


8052 


235 


2021 


3807 


5593 


784CIP2 238 


8096 


236 


2022 


3808 


5594 


784CIP2 239 


8096 


237 


2023 


3809 


5595 


784CIP2 240 


811} 


238 


2024 


3810


5596


784CIP2 241 


8126 


239 


2025 


3811 


5597 


784CIP2_242 


8132 


240 


2026 


3812 


5598 


784CIP2_243 


8137


241 


2027 


3813 


5599 


784CIP2 244 


8137 


242 


2028 


3814 


5600


784CIP2 245 


8159 


243 


2029 


3815 


5601


784CIP2_246 


8159 


244 


2030 


3816 


5602 


784CIP2 247 


8161 


245 


2031 


3817 


5603 


784CIP2_248


8176


246 


2032 


3818


5604 


784CIP2_249


DlJD 


247 


2033 


3819 


5605 


784CIP2_250


8200


248 


2034


3820


5606 




. 0^x2 


249 


2035 


3821 


5607


784CIP2_252


o22 D 


250 


2036 


3822 


5608 




fl "3 1 a 


251 


2037 


3823 


5609 




8254 


252 


2038 


3824 


5610




8255


253 


2039 


3825 


5611 


784CIP2_256


8288 


254 


2040 


3826 


5612 




8296 


255 


2041 


3827 


5613 


784CIP2_258


8329 


256 


2042


3828


5614 


784CIP2_259


8362 


257 


2043 


3829


5615


7B4CIP2 260 


8429 


258 


2044 


3830


5616


784CIP2 261 


8436 


259 


2045 


3831


5617


784CIP2 262 


8448 


260 


2046


3832


5618


784CIP2 263 


8472 


261 


2047 


3833


5619 


784CIP2 264 


8502 


262 


2048


3834


5620


784CIP2 265 


8504 


263 


2049 


3835


5621 


784CIP2 266 


8507 


264 


2050


3836


5622 


784CIP2_268


8509 


265 


2051 


3837


5623


784CIP2 269 


8515 


266 


2052 


3838


5624


784CIP2 270 


8519 


267 


2053 


3839


5625 


784CIP2 271 


8530 


268


2054


3840


5626


784CIP2_272


8532 


269 


2055


3841


5627


784CIP2_273


8532 


270 


2056 


3842


5628 


784CIP2 274 


8539 


271 


2057


3843


5629 


784CIP2 275 


8541 


272 


2058 


3844 


5630 


784CIP2 276 


8543


273 


2059


3845 


5631 


784CIP2_277


8593 


274 


2060


3846


5632 


784CIP2 278 


8595 


275 


2061


3847


5633


784CIP2 279 


8615 


276 


2062 


3848


5634 


784CIP2 280 


8620 


277 


2063 


3849 


5635 


784CIP2 281 


8621 


278


2064


3850


5636


784CIP2 282 


8623 


279 


2065 


3851


5637 


784CIP2 283 


8625 


280 


2066 


3852


5638 


784CIP2 284 


8626 


281 


2067 


3853


5639 


784CIP2 285 


8628 


282 


2068


3854


5640


784CIP2 286 


8629 


283 


2069 


3855


5641 


784CIP2 287 


8630 


284 


2070 


3856 


5642 


784CIP2 288 


8631


285 


2071 


3857


5643


784CIP2_289


8633 


286 


2072


3858


5644


784CIP2 290 


8634 


287 


2073 


3859


5645 


784CIP2 291 


8635 


288 


2074 


3860


5646


784CIP2_292


8636


289


2075 


3861 


5647 


784CIP2_293


8659


290 


2076 


3862 


5648


784CIP2_294


8660 


291 


2077


3863


5649


784CIP2_295


8667 


292 


2078 


3864 


5650 


784CIP2_296


8667 


293 


2079 


3865 


5651 




8685 


294 


2080 


3866 


5652 


784CIP2_298


8805 


295 


2081 


3867


5653


784CIP2_299


8896 


296 


2082 


3868 


5654 




8978 


297 


2083 


3869 


5655 


784CIP2_301


9046 


298 


2084 


3870


5656


784CIP2_302


9048 


299 


2085 


3871 


5657 


784CIP2 303 


9116 


300 


2086 


3872 


5658 


784CIP2_304


9195 


301 


2087 


3873 


5659 


784CIP2 305 


9201 


302 


2088 


3874 


5660 


784CIP2_306


9307 


303


2089 


3875 


5661 


784CIP2_307


9321 


304 


2090


3876 


5662 


784CIP2_308


9397


305


2091 


3877 


5663 


784CIP2 309 


9405 


306 


2092 


3878 


5664 


784CIP2_310


9406 


307 


2093 


3879 


5665


784CIP2 311 


9422


308


2094


3880


5666


784CIP2_312


9494


309


2095


3881


5667


784CIP2_313


9512


310


2096


3882


5668


784CIP2_314


9632


311


2097


3883


5669


784CIP2_315


9661


312


2098


3884


5670


784CIP2_316


9664


313


2099


3885


5671


784CIP2_317


9691


314


2100


3886


5672


784CIP2_318


9700


315


2101


3887


5673


784CIP2_319


9716


316


2102 


3888 


5674 


784CIP2_320 


9721 


317


2103 


3889 


5675 


784CIP2 321 


9870 


318 


2104 


3890 


5676 


784CIP2 322 


9887 


319


2105 


3891 


5677 


784CIP2_323 


9923 


320 


2106 


3892 


5678 


784CIP2 324 


9938 


321 


2107 


3893 


5679 


784CIP2 325 


9964 


322 


2108


3894 


5680 


784CIP2 326 


10007 


323 


2109 


3895 


5681 


784CIP2_327


10009 


324 


2110 


3896


5682 


784CIP2_328 


10046 


325 


2111 


3897 


5683 


784CIP2_329 


10156 


326


2112 


3898 


5684 


784CIP2_330 


10276 


327 


2113 


3899 


5685 


784CIP2 331 


10283 


328 


2114 


3900 


5686 


784CIP2B 1 


152 


329 


2115 


3901 


5687 


784CIP2B_2 


167 


330 


2116


3902 


5688 


784CIP2B_3 


205


331 


2117 


3903 


5689 


784CIP2B 4 


210 


332 


2118 


3904 


5690 


784CIP2B 5 


225 


333 


2119 


3905 


5691 


784CIP2B__6 


226 


334 


2120 


3906 


5692 


784CIP2B_7 


264 


335 


2121 


3907 


5693 


784CIP2B 8 


258 


336 


2122 


3908 


5694 


784CIP2B_9 


293 


337 


2123 


3909 


5695 


784CIP2B_10 


293 


338 


2124 


3910 


5696 


784CIP2B_11 


293 


339 


2125 


3911 


5697 


784CIP2B_12 


302 


340 


2126 


3912 


5698 


784CIP2B 13 


311 


341 


2127 


3913 


5699 


784CIP2B 14 


352 


342 


2128 


3914 


5700 


784CIP2B 15 


358 


343 


2129 


3915 


5701 


784CIP2B 16 


368 


344 


2130 


3916 


5702 


784CIP2B_17


393 


345 


2131 


3917 


5703 


784CIP2B 18 


477 


346 


2132 


3918 


5704 


784CIP2B_19 


508 


347 


2133 


3919


5705


784CIP2B_20


508


348 


2134 


3920 


5706 


784CIP2B_21 


515 


349 


2135 


3921 


5707 


784CIP2B_22 


578 


350 


2136 


3922 


5708 


784CIP2B_23 


586 


351 


2137 


3923 


5709 


784CIP2B 24 


591 


352 


2138 


3924 


5710 


784CIP2B_25 


593 


353


2139


3925


5711


784CIP2B_26


594 


354


2140 


3926 


5712 


784CIP2B 27 


619 


355


2141


3927


5713 


784CIP2B_28 


620 


356


2142 


3928 


5714 


784CIP2B_29


654


357


2143


3929


5715 


784CIP2B 30 


692 


358


2144


3930


5716 


784CIP2B_31 


753 


359


2145 


3931 


5717 


784CIP2B_32 


758 


360


2146 


3932 


5718 


784CIP2B_33 


787 


361 


2147 


3933


5719


784CIP2B_34

O J J 


362 


2148 


3934 


5720 


784CIP2B_35


836 


363 


2149 


3935 


5721 


784CIP2B_36 


870 


364 


2150 


3936 


5722 


784CIP2B_37 


891 


365 


2151 


3937 


5723 


784CIP2B_38 


891 


366 


2152 


3938 


5724 


784CIP2B_39 


921 


367 


2153 


3939 


5725 


784CIP2B_40 


924 


368 


2154 


3940 


5726 


784CIP2B_41 


932 


369 


2155 


3941 


5727 


784CIP2B 42 


942


370 


2156 


3942 


5728 


784CIP2B 43 


958 


371 


2157 


3943 


5729 


784CIP2B 44 


968 


372 


2158 


3944 


5730 


784CIP2B_45


992 


373 


2159 


3945


5731 


784CIP2B 46 


1025 


374 


2160 


3946 


5732 


784CIP2B 47 


1074 


375 


2161 


3947 


5733 


784CIP2B 48 


1104 


376 


2162 


3948 


5734 


784CIP2B 49 


1114 


377 


2163 


3949 


5735 


784CIP2B 50 


1144 


378 


2164 


3950 


5736 


784CIP2B 51 


1262 


379 


2165 


3951


5737 


784CIP2B 52 


1318


380 


2166 


3952 


5738 


784CIP2B_53


1319


381 


2167 


3953 


5739 


784CIP2B_54


X J 4 0 


382 


2168 


3954 


5740


784CIP2B 55 


14 If? 

■IT JO 


383 


2169


3955


5741 


784CIP2B_56


1464 


384 


2170 


3956 


5742 


784CIP2B_57


7 ^ tXA 


385 


2171 


3957 


5743 


784CIP2B_58


16" 17 


386 


2172 


3958 


5744 


784CIP2B_59


1794 
X f 4h 


387 


2173 


3959 


5745 


784CIP2B_60


1/28 


388


2174 


3960 


5746


784CIP2B_61


1772 


389 


2175 


3961 


5747 




1809 


390 


2176 


3962


5748 




186B 


391 


2177 


3963 


5749 




1898 


392 


2178 


3964 


5750 




1926 


393 


2179 


3965 


5751 


1 04 lai tr 6B DO 


"'~io.dc 


394 


2180 


3966 


5752 




1967 


395


2181 


3967


5753 


7fi4TTD5B CR 


1995 


396 


2182 


3968 


5754 


7Rd(*'TP9Ii CO 


2005


397 


2183 


3969 


5755 




2027 


398 


2184 


3970 


5756 


7a4f 1 TP9n 71 
fOn\-Xx'4& IX 


2055 


399 


2185 


3971 


5757 




2103 


400 


2186 


3972 


5758 


784(?.IP9R 71 


2106 


401 


2187 


3973


5759 


7fi4TTP?R 74 

fV±\mXC£0 1 H 


2166 


402 


2188 


3974 


5760 


784CIP2B_75


/b 


403 


2189 


3975


5761 


( O'iy* Xtr&D so 


2176 


404


2190 


3976 


5762


7B4f*TP9li 7A 


44 JO 


405 


2191 


3977 


5763 


7fl4fTP9B 70 
' Q'x\,XC6D / If 


2250 


406 


2192 


3978 


5764 


7RAPTD5R Bfl 
1 ot\^xtr ko du 




407


2193 


3979 


5765


784CIP2B fll 




408


2194 


3980 


5766 




2340 


409 


2195 


3981 


5767 


7Q4CIP2B fll 


9** 71 

«J /A 


410 


2196 


3982 


5768


784CIP9B fld 




411 


2197 


3983 


5769


' Ul ^ ± JT £t O M J 


cull 


412


2198 


3984 


5770 


784CIP2B 86 


949 R 


413 


2199 


3985 


5771 


784CIP2B_87


94m 


414 


2200 


3986 


5772 


784CIP2B_88


2439 


415 


2201 


3987 


5773 


784CIP2B 89 


2447 


416 


2202 


3988 


5774 


784CIP2B 90 


2461 


417 


2203 


3989 


5775 


784CIP2B_91 


2487 


418 


2204 


3990 


5776 


784CIP2B 92 


2492 


419 


2205 


3991 


5777 


784CIP2B 93 


2512 


420


2206 


3992 


5778 


784CIP2B 94 


2564 


421 


2207 


3993 


5779 


784CIP2B 95 


2678 


422 


2208


3994


5780 


784CIP2B 96 


2816 


423


2209 


3995 


5781 


784CIP2B 97 


2818 


424 


2210 


3996 


5782 


784CIP2B 98 


2819


425 


2211 


3997 


5783 


784CIP2B_99


2943 


426 


2212 


3998 


5784 


784CIP2B_100


3137 


427 


2213 


3999 


5785 


784CIP2B_101 


3137


428 


2214 


4000 


5786 


784CIP2B 102 


3160 


429 


2215 


4001 


5787 


784CIP2B 103 


3323 


430 


2216 


4002 


5788 


784CIP2B_104 


33*0 


431 


2217 


4003 


5789 


784CIP2B 105 


3362 


432 


2218 


4004 


5790 


784CIP2B 106 


3417 


433 


2219 


4005 


5791 


784CIP2B_107


3418 


434 


2220 


4006 


5792 


784CIP2B_108


3442 


435 


2221 


4007 


5793 


784CIP2B_109 


3442 


436


2222 


4008 


5794 


784CIP2B 110 


3444 


437 


2223 


4009 


5795 


784CIP2B 111 


3855 


438 


2224 


4010 


5796 


784CIP2B 112 


3883 


439 


2225 


4011


5797


784CIP2B 113 


4090


440 


2226


4012 


5798


784CIP2B 114 


4105 


441 


2227


4013 


5799 


784CIP2B 115 


4142


442 


2228 


4014 


5800


784CIP2B 116 


41 47 

t l*i j£ 


443 


2229 


4015


5801 


784CIP2B 117 


HI AO 


444 


2230 


4016


5802 


784CIP2B_118


ZToc 


445 


2231 


4017


5803


784CIP2B_119


A 1M 


446 


2232 


4018


5804 


784CIP2B 120 




447


2233 


4019


5805


7Rd"rfWR T71 
l a*±\~X tr£o XJLX 


*k J 04 


448


2234 


4020 


5806 


— 7BArtb9R in — 


4306 


449 


2235 


4021 


5807


7flAPTD , 5n i m 


4311 


450 


2236 


4022 


5808 


7fi4TTP5B 17d 
f O *± L.1 e£0 Xj£ H 


4.521 


451 


2237 


4023 


5809 




4323 


452 


2238


4024 


5810


7fl^PtD9H T5£ 


4332 


453 


2239 


4025 


5811 




4488 


454 


2240 


4026 


5812




4588 


455 


2241 


4027 


5813 




5569 


456 


2242 


4028 


5814 


7ft4<" 1 TP7P. "l^O 


5573 


457 


2243 


4029 


5815 




5577 


458 


2244 


4030 


5816 




5579 


459


2245 


4031 


5817 






460 


2246 


4032 


5818 


7ft4CTD7n lid 


5583 


461 


2247 


4033 


5819 


7ft4PTD7n lie 


bbu& 


462 


2248 


4034 


5820 




5585 


463 


2249 


4035 


5821 


7flAPTU7n 1 ii 
/04UJk*r«J9 J. j / 


5591 


464 


2250


4036 


5822 




5593 


465 


2251 


4037 


5823




5594 


466 


2252 


4038 


5824 


7fl4r , T'D7T4 IAD 


5594 


467 


2253


4039 


5825 




55*98 


468 


2254 


4040 


5826




5602 


469 


2255 


4041 


5827 


7BdPTD7n 14"] 


bb Ob 


470 


2256 


4042


5828 




5608


471


2257 


4043 


5829 




5617 


472 


2258 


4044 


5830 


7R4PTP9R Idd 


Do Z \J 


473 


2259 


4045 


5831 




CC77 
30Z4 


474 


2260 


4046


5832 


7B4CIP7R Ida 


eroi 
SO Z J 


475 


2261 


4047 


5833 


784CIP2B_149


5624


476


2262 


4048 


5834 


784CIP2B_150




477 


2263


4049 


5835 


784CIP2B 151 


5627 


478 


2264 


4050 


5836 


784CIP2B 152 


5628


479


2265


4051 


5837 


784CIP2B 153 


5630 


480


2266 


4052 


5838 


784CIP2B 154 


5632 


481 


2267 


4053 


5839 


784CIP2B 155 


5640 


482 


2268


4054 


5840 


784CIP2B_156


5641 


483


2269


4055 


5841 


784CIP2B 157 


5643 


484 


2270 


4056 


5842 


784CIP2B_158


5647


485


2271 


4057 


5843 


784CIP2B_159 


5649


486 


2272 


4058 


5844


784CIP2B_160


5658 


487 


2273 


4059 


5845 


784CIP2B_161 


5659 


488 


2274 


4060 


5846 


784CIP2B 162 


5667 


489 


2275


4061 


5847 


784CIP2B 163 


5672 


490 


2276 


4062 


5848


784CIP2B 164 


5674 


491


2277 


4063 


5849 


784CIP2B 165 


5678


492 


2278


4064 


5850 


784CIP2B 166 


5680 


493 


2279 


4065 


5851


784CIP2B_167


5684 


494


2280 


4066 


5852 


784CIP2B 168 


5686 


495 


2281 


4067 


5853 


784CIP2B_169


3 O 


496 


2282


4068 


5854 




3D JO 


497 


2283 


4069 


5855 


784CIP2B_171




498


2284 


4070 


5856 


7R4C'TP9P, 1 79 


3 / JL2 


499 


2285 


4071 


5857 


784CIP2B_173


£71 Q 
3 / I? 


500 


2286 


4072 


5858


784CIP2B 174 


Q79 A 


501 


2287 


4073 


5859


"~7B4&"iP2B ns 


C999 
3 / 


502 


2288 


4074


5860 




573 0 


503 


2289 


4075 


5861 




5734 


504 


2290 


4076 


5862 




5738


505


2291 


4077


5863


784CIP2B_179


5739


506


2292


4078 




70A(^TT110 i on 


5740 


507 




4079


5865


784CIP2B 181 


5744 


508 


2294 


4080


5866 


784CIP2B 182 


5748 


509 


2295 


4081


5867


784CIP2B 183 


5749 


510 


2296 




4082


5868


784CIP2B_184


5750 


511 


2297 




CCfft 

b c 


784CIP2B 185 


5750 


512 


2298 




5870 


7B4CIP2B 186 


5750 


513 


2299


4085 


5871


784CIP2B 187 


5761 


514 


2300


4086


5872 


7B4CIP2B 188 


5762 


515


2301


4087 


5873 


784CIP2B 189 


5767 


516


2302


4088 


5874 


7B4CIP2B 190 


5773 


517 


2303


4089 


5875 


784CIP2B 191 


5783 


518 


2304 


4090


5876 


784CIP2B 192 


5784 


519 


2305 




5877 


784CIP2B 193 


5788 


520 


2306


4092 


5878 


784CIP2B 194 


5798 


521 


2307 




5879 


784CIP2B 196 


5807 


522 


2308 


4094 


5880


784CIP2B 197 


5818 


523 


2309 




5881 


784CIP2B 198 


5819 


524 


2310




5882 


7B4CIP2B_199 


5827 


525 


2311 


4097 


5883 


7B4CIP2B 200 


5828 


526 




4098 


5884 


784CIP2B 201 


5842 


527 


2313 


4099


5885 


784CIP2B 202 


5853 


528 


2314


4100 


5886 


784CIP2B 203 


5861 


529 


2315


4101 


5887


784CIP2B 204 


5864 


530 


2316 


4102


5888 


784CIP2B 205 


5865 


531 


2317 


4103 


5889


784CIP2B 206 


5871 


532 


2318


4104


5890 


784CIP2B 207 


5873 


533 


2319 


4105




to/ptdid *> r\ a 
/oSLirib 2Uo 


5873 


534 


2320 


4106 






5875 


535 


2321 


4107 


5893 


/o**uxr2o 2xU 


5878 


536


2322 


4108 




/oaUXP2B 211 


5879 


537 


2323 


4109 


5895 


/ D%L>XJr21> 2X2 


5880 


538 


2324


4110 


5896


7R4<" , TD9n Oil 


5880 


539 


2325 


4111 


5897




5880


540


2326


4112 


5898 


7R4r , TD9R nc 


5880 


541 


2327 


4113 


5899 


7H4r , TP9fc 9 1 


3003 


542 


2328 


4114 


5900 


TfldrTD5B 9 1*7 
/O vV>Xf £l / 


5895 


543 


2329 


4115 


5901 


7R4r , TD9H 91 a 


D 0 3 O 


544


2330


4116 


5902 


7R4<"»TD9R 910 


5902 


545


2331


4117 


5903 


7R4r*TD9R 99 A 


can/ 
33U4 


546 


2332 


4118 


5904 


r 7R4r , TT39R ill 
/04LirZD 221 


33XO 


547 


2333 


4119 


5905


784CIP2B 222 


5921 


548 


2334


4120 


5906 


784CIP2B 223 


5927 


549 


2335 


4121 


5907 


784CIP2B_224 


5932 


550 


2336 


4122 


5908 


784CIP2B 225 


5939 


551 


2337 


4123 


5909 


784CIP2B_226 


5945 


552 


2338 


4124 


5910 


784CIP2B_227 


5946 


553 


2339 


4125 


5911 


784CIP2B_228 


5947 


554


2340


4126 


5912 


784CIP2B 229 


5956 


555 


2341 


4127 


5913 


784CIP2B 230 


5967 


556 


2342 


4128


5914 


784CIP2B 232 


597b 


557


2343 


4129 


5915 


784CIP2B 233 


5977


558 




2344 


4130 


5916 


784CIP2B 234 


5978 


559 


2345 


4131 


5917 


784CIP2B 235 


5979 


560 


2346 


4132 


5918 


784CIP2B 236 


5980 


561 


2347 


4133 


5919 


784CIP2B 237 


5988 


562


2348 


4134 


5920 


784CIP2B 238 


5989 


563 


2349 


4135 


5921 


784CIP2B_239


5991 


564 


2350 


4136 


5922 


784CIP2B_240 


5997 


565 


2351 


4137 


5923 


784CIP2B_241


5998 


566


2352 


4138 


5924 


784CIP2B_242 


6003 


567


2353 


4139 


5925 


784CIP2B_243 


6004 


568 


2354 


4140 


5926 


784CIP2B 244 


6013 


569 


2355 


4141 


5927 


784CIP2B_245 


6028 


570 


2356 


4142 


5928 


784CIP2B_246


6028 


571 


2357 


4143 


5929 


784CIP2B 247 


6029 


572 


2358 


4144 


5930 


784CIP2B_248


6031 


573 


2359 


4145 


5931 


784CIP2B 249 


6031 


574


2360 


4146 


5932 


784CIP2B_250


6032 


575 


2361 


4147 


5933 


784CIP2B_251 


6037 


576 


2362 


4148 


5934 


784CIP2B_252


6037 


577 


2363 


4149 


5935 


784CIP2B_253


6043 


578 


2364 


4150


5936 


784CIP2B 254 


6044 


579 


2365


4151 


5937 


784CIP2B 255 


6046 


580 


2366 


4152 


5938 


784CIP2B_256


6048 


581 


2367 


4153 


5939


784CIP2B_257 


6049 


582 


2368 


4154 


5940 


784CIP2B 258 


6051


583 


2369 


4155 


5941 


784CIP2B_259 


6053 


584 


2370 


4156 


5942 


784CIP2B_260


6060 


585 


2371 


4157 


5943


784CIP2B_261 


6063 


586 


2372 


4158 


5944 


784CIP2B 262 


6066 


587 


2373 


4159 


5945 


784CIP2B 263 


6067


588 


2374 


4160 


5946 


784CIP2B_264 


6068 


589 


2375 


4161 


5947 


784CIP2B_265 


6073 


590 


2376 


4162 


5948 


784CIP2B_266 


6076 


591 


2377 


4163 


5949


784CIP2B 267 


6076 


592 


2378


4164 


5950 


784CIP2B 268 


6077 


593 


2379 


4165 


5951 


784CIP2B_269 


6079 


594


2380 


4166 


5952 


784CIP2B_270


6082 


595 


2381 


4167 


5953 


784CIP2B_272


6088 


596 


2382 


4168 


5954 


784CIP2B 273 


6091 


597 


2383 


4169 


5955 


784CIP2B_274 


6094 


598 


2384 


4170 


5956 


784CIP2B 275 


6101 


599 


2385 


4171 


5957 


784CIP2B 276 


6103 


600 


2386 


4172 


5958 


784CIP2B_277 


6104 


601 


2387 


4173 


5959 


784CIP2B_278 


6108 


602 


2388 


4174 


5960 


784CIP2B 279 


6112 


603 


2389 


4175 


5961 


784CIP2B_280 


6121 


604 


2390 


4176 


5962 


784CIP2B 281 


6125 


605 


2391


4177


5963 


784CIP2B_282 


6126 


606 


2392 


4178 


5964 


784CIP2B_283 


6128 


607 


2393 


4179 


5965 


784CIP2B 284 


6129 


608 


2394 


4180 


5966 


784CIP2B 285 


6133 


609 


2395 


4181 


5967 


784CIP2B_286


6133 


610 


2396


4182 


5968 


784CIP2B_287 


6135 


611 


2397 


4183 


5969 


784CIP2B 288 


6139 


612 


2398 


4184 


5970 


784CIP2B_289 


6141


613


2399 


4185 


5971 


784CIP2B 290 


6145 


614 


2400 


4186 


5972 


784CIP2B_291


6146 


615 


2401 


4187 


5973 


784CIP2B_292


6148 


616 


2402 


4188 


5974 


784CIP2B_293 


6149 


617


2403


4189 


5975 


784CIP2B_294 


6149 


618 


2404 


4190 


5976


784CIP2B_295


6153 


619 


2405 


4191 


5977 


784CIP2B_296


6159 


620 


2406


4192 


5978


784CIP2B_297


6164 


621 


2407 


4193 


5979 


784CIP2B 298 


6167 


622 


2408 


4194 


5980 


784CIP2B 299 


6172


623 


2409 


4195 


5981 


784CIP2B 300 


6173 


624 


2410 


4196 


5982


784CIP2B 301 


6190 


625


2411 


4197 


5983 


784CIP2B 302 


6194 


626 


2412 


4198


5984 


784CIP2B 303 


6196 


627


2413 


4199 


5985


784CIP2B_304


DX7 f 


628 


2414 


4200 


5986


784CIP2B_305


6198 


629 


2415 


4201 


5987 




CI Q Q 
0170 


630 


2416 


4202 


5988 




b214 


631 


2417 


4203 


5989 


/ Oi^.xr<o JU7 


6i c 

bZlb 


632 


2418 


4204 


5990 




b219 


633 


2419 


4205 




784CIP2B_311


6226 


634 


2420 


4206 


5992 




6229 


635 


2421 


4207 




HO. A /-"TtJOtJ 111 

/04CXF2B 313 


6234 


636 


2422 


4208 


5994 


7P^PTD9D 11/1 


co->*7 

523 7 


637 


2423 


4209 


5995




6238 


638 


2424 


4210


5996 


/osUir^o Jib 


6239 


639 


2425 


4211 


5997


784CIP2B_317


6239 


640 


2426 


4212 


5998


784CIP2B_318


6239 


641


2427 


4213 


5999


/o4L>XJr2B il? 


6240 


642 


2428 


4214 


6000




6244 


643 


2429 


4215 


6001


784CIP2B_321


6245


644 


2430 


4216 




784CIP2B_322


6250 


645 


2431 


4217 


6003




6252 


646


2432 


4218 


6004


784CIP2B_324


6252 


647 


2433 


4219 


6005 




6256 


648 


2434 


4220 


6006 


784CIP2B_326


6260 


649 


2435 


4221 


6007 


/o*i t^Xc^n 32 / 


6261 


650 


2436 


4222 


6008 


/o^LlrZh J2H 




651 


2437 


4223 


6009


784CIP2B_329


6265 


652 


2438 


4224 


6010


784CIP2B_330


6266 


653 


2439 


4225 


6011


784CIP2B_331


6270 


654 


2440 


4226 


6012


784CIP2B_332


6271 


655


2441 


4227 


6013 




6274 


656 


2442 


4228 




l D<l(.lr£0 337 


6276 


657 


2443 


4229 


6015 




62B1 


658 


2444 


4230 


6016


70Ar , TD*>Q "81*7 


OQ1 """" 


659


2445 


4231 


6017


•7Q APT D*>D lift 


b2bo 


660 


2446 


4232 


6018 


iO*awlr£D JJ7 


C "5 Q**> 
0£7£ 


661 


2447 


4233 


6019 






662 


2448 


4234 


6020 


/ a *■* ^- J. Jr ^ t5 04j 


C"» 1 *5 

D J 1Z | 


663 


2449 


4235 


6021 




CI 1 *"> 
OJ X<r! 


664 


2450 


4236 


6022 




c"*i io " 


665 


2451 


4237 


6023 


784CIP2B_345


6322 


666 


2452 


4238 


6024 


784CIP2B_347


6324 


667 


2453 


4239 


6025 


784CIP2B 349 


6329 


668


2454 


4240 


6026 




6331 


669 


2455


4241 


6027 


784CTP2R I*?! 


6333 


670 


2456 


4242 


6028 


> oi V- J. r -ZD J3/ 


6334 


671 


2457 


4243 


6029 


784CIP2B 353 


6337 


672


2458


4244 


6030 


784CIP2B 354 


6339 


673 


2459 


4245 


6031 


784CIP2B_355 


6346 


674 


2460


4246 


6032 


784CIP2B 356 


6348 


675 


2461 


4247 


6033 


784CIP2B_357


6348


676


2462


4248 


6034 


784CIP2B_358


6350 


677 


2463 


4249 


6035 


784CIP2B 359 


6351 


678 


2464 


4250 


6036 


784CIP2B 360 


6355 


679 


2465 


4251 


6037


784CIP2B_361


6362


680 


2466 


4252


6038 


784CIP2B 362 


6366 


681 


2467 


4253 


6039 


784CIP2B 363 


6369 


682 


2468


4254 


6040 


784CIP2B_364


6371 


683 


2469 


4255


6041 


784CIP2B_365


6376 


684 


2470 


4256 


6042 


784CIP2B 366 


6379 


685


2471 


4257 


6043 


784CIP2B 367 


6380 


686 


2472 


4258


6044 


784CIP2B_368


6381 


687 


2473 


4259


6045


784CIP2B 369 


6392 


688


2474 


4260 


6046 




C1DC 


689 


2475 


4261 


6047 


784CTP7R **71 


CI Q"7 

o j y / 


690 


2476 


4262 


6048 


7S4CTP2B 379 


c" a hri 


691 


2477 


4263 


6049


rbfiLXirZo J/J 


6401 


692 


2478 


4264 


6050 


'Ul^ir^o J74 


6411 


693 


2479 


4265 


6051




6411 


694 


2480 


4266 




7D/IPTDTD -JT£ 

/B4C1PZO jvo 


6411 


695 


2481


4267 


6053


784CIP2B_377


6416 


69$ 


2482 


4268 




784CIP2B_378


6418 


697 


2483 




CACr 


784CIP2B 379 


6422 


698 


2484


4270 


6056


784CIP2B 380 


6423 


699 


2485 


4271 




784CIP2B 381 


6426 


700 


2486 


4272 


6058 


784CIP2B_382


6427 


701 


2487 


4273


6059 


784CIP2B_383 


6428


702 


2488


4274 


6060


784CIP2B 384 


6429 


703 


2489 


4275 


6061


784CIP2B_385


6430 


704 


2490 


4276




784CIP2B_386


6432 


705 


2491 


4277


6063


784CIP2B 387 


6432 


706 


2492 


4278 


6064 


784CIP2B_388


6438 


707 


2493 


4279


6065 


784CIP2B_389 


6441 


708 


2494 


4280 


6066


784CIP2B_390


6446 


709 


2495 


4281


6067


784CIP2B_391


6454 


710 


2496 


4282


6068


784CIP2B 392 


6459 


711 


2497 


4283 


6069


784CIP2B 394 


6461


712 


2498 


4284


6070


784CIP2B_395


6467


713 


2499 


4285


6071


784CIP2B_396


6468 


714 


2500 




6072 


784CIP2B_397 


6487


715 


2501 


4287


6073 


784CIP2B 398 


6491 


716 


2502 


4288 


6074


784CIP2B_399


6506 


717 


2503 


4289


6075


784CIP2B_401


6514 


718 


2504 


4290 




784CIP2B_402


6519 


719 


2505 


4291 


6077 


784CIP2B_403


6521 


720 


2506 


4292 


6078 


"7 ha rTD^n /in/i 


6532 


721 


2507 


4293


6079




6536 


722 


2508 


4294 


6080 


'9?i_Xtr^J3 4Ub 


6543 


723 


2509 


4295 


6081 




6544 


724 


2510 


4296 


6082


TfUlPTD^P 4 ftp 


6548


725 


2511 


4297 


6083 


7fl4r"TP7Pl dnq 


CCC1 


726 


2512 


4298 


6084 


7A4PTP5B 41fl 


D33i 


727 


2513 


4299 


6085 


784CIP2B_411


6552 


728 


2514 


4300 


6086 


7B4CIP5B 417 


bDJl 


729 


2515 


4301 


6087 


784CIP2B_413


6556


730 


2516 


4302 


6088 


784CIP2B 414 


6560 


731 


2517


4303 


6089 


7B4CTP9R 4 1 


fisS^ 

ODD J 


732 


2518


4304 


6090 


7ft4r*TPon aic 




733 


2519 


4305 


6091 


784CIP2B_417 


6567 


734 


2520 


4306


6092 


784CIP2B 418 


6573 


735 


2521 


4307 


6093 


784CIP2B_419 


6575 


736 


2522 


4308 


6094 


784CIP2B 420 


6577


737 


2523 


4309 


6095 


784CIP2B_421 


6593 


738 


2524 


4310 


6096 


784CIP2B_422 


6595 


739 


2525


4311 


6097 


784CIP2B 423 


6599 


740


2526 


4312 


6098 


784CIP2B 424 


6625 


741 


2527 


4313 


6099 


784CIP2B 425 


6625 


742 


2528 


4314 


6100 


784CIP2B 426 


6626 


743 


2529 


4315 


6101 


784CIP2B 427 


6630 


744 


2530 


4316 


6102 


784CIP2B_428


6631 


745 


2531 


4317 


6103 


784CIP2B 429 


6632 


746 


2532 


4318 


6104 


784CIP2B 430 


6633 


747 


2533 


4319 


6105 


784CIP2B 431 


6634 


748 


2534 


4320 


6106 


784CIP2B 432 


6638 


749


2535


4321 


6107 


784CIP2B 433 


6641 


750 


2536 


4322 


6108


784CIP2B_434


6644


751


2537


4323 


6109 


784CIP2B 435 


6646 


752 


2538 


4324 


6110 


784CIP2B 436 


6648 


753 


2539 


4325 


6111 


784CIP2B 437 


6652 


754 


2540 


4326 


6112 


784CIP2B 438 


6654 


755 


2541 


4327 


6113 


784CIP2B_439


fifi 57 


756 


2542 


4328 


6114 


794PIPPB 44A 


0030 


757 


2543 


4329 


6115 


784CIP2B_441


6663 


758 


2544 


4330 


6116


784CIP2B 442 


6664 


759


2545


4331 


6117 


7B4PIP9B 44^1 


DDOO 


760 


2546 


4332 


6118 


784CIP2B_444


D 00 7 


761 


2547 


4333 


6119 


7RdrTP5U 4AC: 
IOH\~XC£.0 ***i3 


DO / J 


762 


2548 


4334 


6120 


784CIP2B 


OOQ3 


763 


2549 


4335 


6121 


7R4PTDOO 44^ 


boo / 


764 


2550 


4336 


6122 


7B4PTP2B 44A 


OOC7 


765 


2551 


4337 


6123 


7R4PTP9R 440 


OQ3J 


766


2552 


4338 


6124 




<f £ Q Q 


767


2553 


4339 


6125 


784PTP5R 4m 


0073 


768


2554


4340 


6126 


7R4CIP2B 459 

tat ^ cap X £t 


6705 


769


2555 


4341 


6127 


784PIP9B 45"* 




770


2556 


4342 


6128 


7B4CTP2B 4*54 


O /-J. J 


771 


2557 


4343 


6129 


784CIP2B_455


6716 


772 


2558


4344 


6130 




P /^3 


773 


2559 


4345 


6131 


784CIP2B_457


6726


774 


2560 


4346 


6132 


7B4CIP2B «45R 


6727 


775 


2561 


4347 


6133 


7B4nTP9B 4^Q 


o 'JU 


776


2562


4348 


6134 


7fl4r , TP3R 4K0 


c*7"a ft 


777 


2563 


4349 


6135


7B4CTP2B 4ff1 


* £"7^ ft 


778 


2564 


4350 


6136 


7B4PTP2B 4K9 

/ V!tV»krAO ISA 


D / J i 


779 


2565 


4351 


6137 


7B4CTP2B 4K'< 

/D1L4CAO 1DJ 


6733 


780 


2566 


4352 


6138 


784CIP3B 4 £4 




781 


2567 


4353


6139 


784CIP2B_465


6745 


782 


2568 


4354 


6140 


784CIP2B_466


6751 


783 


2569 


4355 


6141 


784CIP2B_467


6754 


784 


2570 


4356 


6142 


784CIP2B_468


6758 


785 


2571 


4357 


6143 


784CIP2B_469


6761


786 


2572


4358 


6144 


784CIP2B 470 


6765 


787 


2573 


4359 


6145 


784CIP2B 471 


6768


788 


2574 


4360 


6146


784CIP2B_472


6773 


789 


2575


4361 


6147 


784CIP2B_473


6776 


790


2576 


4362 


6148


784CIP2B_474 


6796 


791 


2577 


4363 


6149 


784CIP2B 475 


6798 


792 


2578


4364 


6150 


784CIP2B_476


6823 


793 


2579 


4365 


6151 


784CIP2B 477 


6825 


794 


2580 


4366 


6152 


784CIP2B_478 


6826 


795 


2581 


4367 


6153 


784CIP2B_479 


6839 


796 


2582 


4368 


6154 


784CIP2B_480


6844


797 


2583 


4369 


6155 


784CIP2B 482 


6849 


798 


2584


4370 


6156 


784CIP2B_483 


6854


799 


2585 


4371 


6157 


784CIP2B_484 


6857 


800 


2586 


4372 


6158 


784CIP2B_485 


6861 


801 


2587


4373


6159


784CIP2B_486 


6873 


802


2588


4374 


6160 


784CIP2B 487 


6875 


803 


2589 


4375 


6161 


784CIP2B_488 


6877 


804 


2590 


4376


6162


784CIP2B_489


6880 


805 


2591 


4377 


6163 


784CIP2B_490 


6885 


806 


2592 


4378 


6164 


784CIP2B_491 


6890 


807 


2593 


4379 


6165 


784CIP2B 492 


6890 


808 


2594 


4380 


6166 


784CIP2B 493 


6894 


809 


2595 


4381 


6167 


784CIP2B 494 


6901


810 


2596 


4382 


6168 


784CIP2B 495 


6904 


811 


2597 


4383 


6169 


784CIP2B 496 


6907 


812 


2598 


4384 


6170 


784CIP2B_497


6914 


813 


2599


4385 


6171 


784CIP2B 498 


6917 


814


2600 


4386 


6172 


784CIP2B 499 


6923 


815 


2601 


4387 


6173 


784CIP2B_500


6929 


816 


2602 


4388 


6174 


7fi4C!l'P2B ~*T(Y\ 


caii 


817 


2603 


4389 


6175


784CIP2B_502




818 


2604 


4390 


6176 




c CkA n 

oyiu 


819 


2605 


4391 


6177 


784CIP2B 5f)d 


o343 


820 


2606 


4392 


6178 


784C^P2B ^05 




821


2607 


4393 


6179 




KQ47 
1)74 / 


822 


2608 


4394 


6180 


7ftAf , TP9T» cm 


6949 


823 


2609 


4395 


6181 


"7fldr , T , P9n cftft 


6959 


824 


2610 


4396


6182


7HAr*TD3R 


by fau 


825


2611 


4397 


6183 




D702 


826 


2612 


4398 


6184 


'OILiriO 311 


6963 


827 


2613 


4399 


6185 


/ O** LlrZfl D lz 


6967 


828 


2614 


4400 


6186 


(OlUlr^o DlJ 


6>983 


829 


2615 


4401 


6187 


'0**V.ltr*O 314 


6988 


830


2616


4402


6188


/04V>lr'4Sll 313 


6996 


831


2617 


4403 


6189


784PTP2R 51 A 


7 003 


832 


2618 


4404 


6190 


7B4CTP2B qi7 


7016 


833 


2619 


4405 


6191 


7B4m'P2B ' Cl A 


7017 


834 


2620 


4406 


6192 


7fi4pTp2B ci Q 


7025 


835


2621 


4407 


6193 


784PTP3B 52fi 

» Oft \rllr«bX3 3^U 


•7 A*> C 


836


2622 


4408 


6194 


7H4CIP2B i?l 


/U£3 


837 


2623 


4409 


6195 


7fl4f*TP2B 57"> 


7050 


838 


2624 


4410 


6196


7R^rTD3H tlOi 
/O4iullr40 3/J 


7051 


839 


2625 


4411


6197


ronuir^o 3«4 


* — 7nec 

7033 


840 


2626 


4412 


6198 




7060 


841 


2627 


4413 


6199 


784CIP2B 52fi 


/UQ4 


842 


2628 


4414 


6200 


7fi4CIP2B C37 


70^7 


843 


2629 


4415 


6201 


784CTP9B 52ft 


fU /l 


844 


2630


4416 


6202 


784CIP2B 529 


(VIA 


845 


2631 


4417 


6203 


7B4CIP2B 5*50 


/U /3 


846


2632 


4418 


6204 


784CIP2B_531


7076


847 


2633 


4419 


6205 


784CIP2B 532 


7088 


848 


2634 


4420 


6206 


784CIP2B 533 


7089 


849 


2635 


4421 


6207 


784CIP2B 534 


7091 


850 


2636 


4422 


6208 


784CIP2B_535


7091 


851


2637 


4423 


6209 


784CIP2B_536 


7104 


852 


2638 


4424 


6210 


784CIP2B 537 


7105 


853 


2639 


4425 


6211 


784CIP2B_538 


7105 


854 


2640 


4426 


6212 


784CIP2B_539


7109 


855 


2641 


4427 


6213


784CIP2B_540


7109 


856


2642 


4428 


6214 


784CIP2B 541 


7119 


857 


2643 


4429 


6215 


784CIP2B_542 


7120 


858 


2644 


4430 


6216 


784CIP2B 543 


7121 


859 


2645 


4431 


6217 


784CIP2B_544


7126 


860 


2646 


4432 


6218 


784CIP2B_545 


7127 


861 


2647 


4433 


6219 


784CIP2B_546 


7130 


862 


2648 


4434 


6220 


784CIP2B_547 


7131 


863 


2649 


4435 


6221 


784CIP2B 548 


7144 


864 


2650 


4436 


6222 


784CIP2B 549 


7159 


865


2651 


4437


6223


784CIP2B 550 


7163


866 


2652




6224


784CIP2B_551


7175 


867 








TOdftTJIB CCO 


7188 


868






6226


784CIP2B_553


7189 


869 


2655


4441


6227


784CIP2B_554


7190 


870 






6228


784CIP2B_555


7191 


871


2657


4443


6229


784CIP2B_556


7203 


872


2658


4444 


6230 


784CIP2B 557 


7204 


873


2659 


4445 


6231 


784CIP2B_558


7208 


874


2660


4446 


6232 


784CIP2B 559 


7209 




2661 


4447 


6233 


784CIP2B 560 


7210 


876 


2662 


4448 


6234 


784CIP2B 561 


7216


877 


2663 


4449 


6235 


784CIP2B_562 


7221 


878 


2664 


4450 


6236 


784CIP2B_563 


7230


879 


2665 


4451 


6237 


784CIP2B_564 


7237 


880 


2666 


4452 


6238 


784CIP2B_565 


7240 


881 


2667 


4453 


6239 


784CIP2B_566


7245


882 


2668 


4454 


6240 


784CIP2B_567 


7250 


883 


2669 


4455


6241 


784CIP2B_568


7251 


884 


2670 


4456 


6242 


784CIP2B_569 


7255 


885 


2671 


4457 


6243 


784CIP2B 570 


7260 


886 


2672 


4458 


6244 


784CIP2B 571 


7265 


887 


2673


4459 


6245 


784CIP2B_572 


7268 


888 


2674 


4460 


6246 


784CIP2B 573 


7275 


889 


2675 


4461 


6247 


784CIP2B_574


7279 


890 


2676 


4462 


6248 


784CIP2B_575 


7283 


891


2677 


4463 


6249 


784CIP2B_576 


7283 


892 


2678 


4464 


6250 


784CIP2B 577 


7287 


893


2679 


4465 


6251 


784CIP2B 578 


7301 


894


2680 


4466 


6252 


784CIP2B_579 


7308 


895 


2681 


4467 


6253 


784CIP2B_580


7308 


896 


2682 


4468


6254 


784CIP2B 581 


7309 


897


2683 


4469 


6255 


784CIP2B_582 


7319 


898 


2684 


4470 


6256 


784CIP2B_583


7320 


899 


2685


4471 


6257 


784CIP2B_584 


7326 


900 


2686 


4472 


6258 


784CIP2B_585 


7326 


901 


2687 


4473 


6259


784CIP2B 586 


7334


902 


2688 


4474 


6260 


784CIP2B_587


7337 


903 


2689 


4475


6261 


784CIP2B_588


7339 


904 


2690 


4476 


6262 


784CIP2B_589 


7344 


905 


2691 


4477 


6263


784CIP2B_590


7355 


906 


2692 


4478 


6264 


784CIP2B_591 


7363 


907 


2693 


4479 


6265 


784CIP2B_592 


7363 


908 


2694


4480 


6266 


784CIP2B 593 


7365 


909 


2695 


4481 


6267 


784CIP2B 594 


7368 


910


2696


4482 


6268 


784CIP2B_595


7369 


911 


2697 


4483


6269 


784CIP2B_596 


7372 


912


2698


4484


6270


784CIP2B_599


7375 


913


2699


4485 


6271 


784CIP2B_600


7381


914


2700


4486


6272 


784CIP2B_601 


7383


915 


2701


4487


6273 


784CIP2B 602 


7387 


916


2702


4488 


6274 


784CIP2B 603 


7391 


917


2703


4489


6275 


784CIP2B_604


7393 


918


2704


4490 


6276 


784CIP2B 605 


7395 


919 


2705 


4491 


6277 


784CIP2B_606


I / 


920 


2706 


4492 


6278 


784CIP2B_607


7399 


921 


2707 


4493 


6279 


784CIP2B_608


7405 


922 


2708 


4494


6280 


784CIP2B_609


7406 


923 


2709 


4495 


6281 


784CIP2B_610


7406 


924 


2710 


4496 


6282 


784CIP2B_611 


7409 


925 


2711 


4497 


6283 


784CIP2B_612 


7410 


926 


2712 


4498


6284 


784CIP2B_613


7411


927 


2713 


4499 


6285 


784CIP2B 614 


7417 








928 


2714 


4500 


6286 


784CIP2B_615 


7418 


929 


2715


4501 


6287 


784CIP2B_616


7421 


930 


2716 


4502 


6288 


784CIP2B 617 


7422


931 


2717 


4503 


6289


784CIP2B_618


7422 


932 


2718 


4504 


6290 


784CIP2B_619 


7423 


933 


2719 


4505 


6291 


784CIP2B_620


7424 


934 


2720 


4506 


6292 


784CIP2B_621


7426


935 


2721 


4507 


6293 


784CIP2B_622


7427 


936 


2722 


4508 


6294 


784CIP2B_623


7428 


937 


2723 


4509 


6295 


784CIP2B 624 


7430


938 


2724 


4510 


6296 


784CIP2B_625


7435


939 


2725 


4511 


6297 


784CIP2B_626


7437 


940 


2726 


4512 


6298


784CIP2B_627






941 


2727 


4513 


6299 


784CIP2B_628




942 


2728 


4514 


6300


784CIP2B_629


7442


943 


2729 


4515


6301 


784CIP2B_630


7450 


944 


2730


4516 


6302 


784CIP2B_631


Ji 


945 


2731 


4517 


6303 


784CIP2B_632


T A C 0 


946 


2732 


4518 


6304 


784CIP2B_633


/ 1 Dsl 


947 


2733 


4519 


6305 


784CIP2B_634




948 


2734 


4520 


6306


784CIP2B_635






949 


2735 


4521 


6307


784CIP2B_636


7461 


950 


2736 


4522 


6308


784CIP2B_637


1 AC 1 


951 


2737 


4523 


6309 


784CIP2B_638


*7A CC 


952 


2738 


4524


6310


784CIP2B_639




7469 


953 


2739


4525


6311


784CIP2B_640


7473


954 


2740 


4526


6312 


784CIP2B_641


7481 


955 


2741 


4527


6313


784CIP2B_642




/•o « 


956 


2742 


4528


6314


784CIP2B_643


7482


957 


2743 


4529 


6315 


784CIP2B_644




958 


2744 


4530 


6316


784CIP2B_645


7485


959 


2745 


4531 


6317


784CIP2B_646




7 4 £ 


960 


2746 


4532 


6318


784CIP2B_647


7487 


961 


2747 


4533


6319


784CIP2B_648


7491


962 


2748 


4534 


6320


784CIP2B_649


7492


963 


2749


4535


6321


784CIP2B_650




TA (3 A ~ 


964 


2750 


4536


6322 


784CIP2B_651


1A QD 


965


2751 


4537 


6323


784CIP2B_652






966 


2752


4538 


6324


784CIP2B_653




7CAQ 

l Duo 


967 


2753


4539 


6325 


784CIP2B_654


7516 


968


2754


4540


6326


784CIP2B_655




969 


2755 


4541 


6327 


784CIP2B_656


7519 


970 


2756


4542 


6328 


784CIP2B_657


7521 


971 


2757 


4543 


6329 


784CIP2B_658


7529 


972 


2758 


4544 


6330 


784CIP2B 659 


7532 


973 


2759 


4545 


6331 


784CIP2B_660


7533 


974 


2760 


4546


6332 


784CIP2B_661


7535


975 


2761 


4547 


6333 


784CIP2B_662


7545


976 


2762 


4548


6334 


784CIP2B_663


7546 


977 


2763


4549


6335 


784CIP2B 664 


7552 


978 


2764 


4550 


6336 


784CIP2B_665 


7554 


979 


2765 


4551 


6337 


784CIP2B_666


7567 


980 


2766 


4552


6338


784CIP2B_667


7569


981 


2767 


4553 


6339 


784CIP2B_668 


7575 


982


2768 


4554 


6340 


784CIP2B_669


7576 


983 


2769


4555 


6341 


784CIP2B_670


7577 


984 


2770 


4556 


6342 


784CIP2B_671 


7579 


985 


2771 


4557 


6343 


784CIP2B_672


7582 


986 


2772 


4558


6344 


784CIP2B_673 


7587


987 


2773 


4559 


6345 


784CIP2B_674


7589 


988


2774 


4560 


6346 


784CIP2B_675 


7597 


989 


2775 


4561 


6347 


784CIP2B 676 


7597


990 


2776 


4562 


6348


784CIP2B_677


7609 


991 


2777 


4563 


6349 


784CIP2B_678 


7609 


992 


2778 


4564 


6350


784CIP2B_679


7609 


993 


2779 


4565 


6351 


784CIP2B_680


7613 


994 


2780 


4566 


6352 


784CIP2B_681


7623 


995 


2781


4567 


6353 


784CIP2B_682


7629 


996 


2782 


4568 


6354


784CIP2B_683


7630


997


2783 


4569


6355


784CIP2B_684








998 


2784 


4570 


6356


784CIP2B_685


7635 


999 


2785 


4571 


6357


784CIP2B_686


/ 0 j 0 


1000 


2786


4572 


6358


784CIP2B 687 


7ff"*q 


1001 


2787


4573 


6359


784CIP2B_688


7C4C 


1002 


2788 


4574 


6360


784CIP2B 689 


7647 


1003 


2789


4575


6361 


784CIP2B 690 


7648 


1004 


2790 


4576 


6362 


784CIP2B 691 


4 d go 


1005 


2791 


4577 


6363


784CIP2B 692 


7oo4 


1006 


2792 


4578


6364 


784CIP2B_693 




1007 


2793 


4579 


6365 


784CIP2B_695


7674 


1008 


2794 


4580 


6366 


784CIP2B_696 


7o Jb 


1009 


2795


4581


6367


784CIP2B_697


7676 


1010 


2796


4582


6368


784CIP2B 698 


7681 


1011 


2797 


4583 


6369 


784CIP2B_699 


7688 


1012 


2798 


4584 


6370 


784CIP2B 700 


7693 


1013 


2799 


4585 


6371 


784CIP2B 701 


7694 


1014 


2800 


4586


6372 


784CIP2B_702


7715 


1015 


2801 


4587 


6373 


784CIP2B 703 


7716 


1016 


2802 


4588 


6374 


784CIP2B 704 


7718 


1017 


2803 


4589 


6375 


784CIP2B_705 


7721 


1018 


2804 


4590 


6376


784CIP2B 706 


7723 


1019 


2805 


4591


6377 


784CIP2B 707 


7729 


1020 


2806 


4592 


6378


784CIP2B 708 


7733 


1021 


2807 


4593 


6379 


784CIP2B 709 


7735


1022 


2808


4594 


6380


784CIP2B_710


7741 


1023 


2809


4595 


6381 


784CIP2B 711 


7743 


1024 


2810 


4596 


6382


784CIP2B_712


7748 


1025 


2811 


4597 


6383


784CIP2B_713


7749 


1026 


2812 


4598 


6384


784CIP2B_714


7750 


1027 


2813 


4599


6385


784CIP2B_715


7757


1028 


2814 


4600 


6386 


784CIP2B_716


7759 


1029 


2815 


4601


6387


784CIP2B_717






1030 


2816


4602 


6388 


784CIP2B_718


7727} 


1031 


2817 


4603 


6389 


784CIP2B 719 


7764 


1032 


2818 


4604 


6390 


784CIP2B_720


*)<7CC 

77o3 


1033 


2819 


4605 


6391 


784CIP2B_721


77CC 


1034 


2820 


4606 


6392 


784CIP2B_722 




1035 


2821 


4607 


6393 


784CIP2B 723 


7769 


1036 


2822 


4608 


6394 


784CIP2B 724 


777fl 


1037 


2823 


4609 


6395 


784CIP2B_725 


7774 


1038 


2824


4610 


6396 


784CIP2B 726 


7779 


1039 


2825


4611


6397 


784CIP2B_727


7781 


1040 


2826 


4612 


6398 


784CIP2B 728 


7782 


1041 


2827 


4613 


6399 


784CIP2B 729 


7783 


1042 


2828 


4614 


6400 


784CIP2B_730




1043 


2829


4615


6401


784CIP2B_731


7792 


1044 


2830 


4616 


6402 


784CIP2B 732 


7795 


1045 


2831 


4617 


6403 


784CIP2B_733


7801 


1046 


2832 


4618 


6404 


784CIP2B_734


7807 


1047 


2833 


4619 


6405 


784CIP2B_735


7808 


1048 


2834 


4620 


6406 


784CIP2B_736


7819 


1049 


2835 


4621 


6407 


784CIP2B 737 


7824 


1050 


2836 


4622 


6408 


784CIP2B 738 


7826 


1051 


2837 


4623 


6409


784CIP2B 739 


7829


1052 


2838 


4624 


6410 


784CIP2B 740 


7832 


1053 


2839 


4625 


6411 


784CIP2B 741 


7833 


1054 


2840 


4626 


6412 


784CIP2B_743 


7847 


1055 


2841 


4627 


6413 


784CIP2B 744 


7848


1056 


2842 


4628 


6414 


784CIP2B_745


7853 


1057 


2843 


4629 


6415 


784CIP2B_746


7854 


1058


2844


4630 


6416


784CIP2B_747


7856 


1059


2845 


4631 


6417 


784CIP2B 748 


7862 


1060 


2846 


4632 


6418 


784CIP2B_749 


7865 


1061 


2847


4633 


6419 


784CIP2B_750


7874 


1062 


2848 


4634 


6420 


784CIP2B 751 


7877


1063


2849 


4635


6421 


784CIP2B_752


7880 


1064 


2850 


4636


6422 


784CIP2B_753 


7882 


1065 


2851


4637 


6423 


784CIP2B_754


7884


1066 


2852 


4638 


6424 


784CIP2B 755 


7886 


1067


2853 


4639 


6425 


784CIP2B 756 


7888 


1068 


2854 


4640 


6426 


784CIP2B_757


7889


1069 


2855


4641 


6427 


784CIP2B 758 


7901


1070 


2856 


4642 


6428 


784CIP2B 759 


7910 


1071 


2857 


4643 


6429 


784CIP2B 760 


7911 


1072 


2858 


4644 


6430 


784CIP2B_761


7921 


1073 


2859 


4645 


6431 


784CIP2B_762


7923 


1074 


2860 


4646 


6432 


784CIP2B_763


7924


1075


2861 


4647 


6433 


784CIP2B_764


7925 


1076 


2862 


4648 


6434 


784CIP2B_765


7928


1077 


2863


4649 


6435 


784CIP2B_766


7929 


1078 


2864 


4650 


6436 


784CIP2B_767


7930 


1079 


2865 


4651 


6437 


784CIP2B 768 


7934 


1080 


2866 


4652 


6438


784CIP2B 769 


7938


1081 


2867 


4653 


6439


784CIP2B_770 


7942 


1082 


2868


4654 


6440


784CIP2B 771 


7945 


1083 


2869 


4655 


6441 


784CIP2B_772


7946 


1084 


2870 


4656 


6442 


784CIP2B 773 


7948 


1085 


2871 


4657 


6443 


784CIP2B_774


7951 


1086


2872 


4658 


6444 


784CIP2B 775 


7952 


1087


2873 


4659


6445 


784CIP2B_776


7953 


1088


2874 


4660 


6446 


784CIP2B_777


7954 


1089 


2875 


4661 


6447 


784CIP2B_778


7957 


1090 


2876 


4662 


6448


784CIP2B 779 


7958 


1091 


2877 


4663 


6449 


784CIP2B_780 


7961 


1092 


2878


4664


6450 


784CIP2B 781 


7965 


1093 


2879 


4665 


6451 


784CIP2B_782


7966 


1094 


2880


4666


6452 


784CIP2B_783


" 1919 


1095 


2881 


4667 


6453 


784CIP2B_784 


7986 


1096


2882


4668 


6454 


784CIP2B 785 


7986 


1097 


2883


4669 


6455 


784CIP2B 786 


7988 


1098 


2884 


4670 


6456 


784CIP2B 787 


7991 


1099 


2885


4671 


6457 


784CIP2B 788 


7992 


1100 


2886 


4672 


6458 


784CIP2B 789 


7992


1101 


2887 


4673 


6459 


784CIP2B 790 


7992 


1102 


2888 


4674 


6460


784CIP2B 791 


7992 


1103 


2889 


4675


6461 


784CIP2B_792


8003 


1104 


2890


4676 


6462 


784CIP2B_793


8014 


1105 


2891


4677 


6463 


784CIP2B 794 


8015 


1106 


2892 


4678 


6464 


784CIP2B_795


8016 


1107 


2893


4679 


6465


784CIP2B 796 


8017 


1108 


2894


4680


6466


784CIP2B 797 


8019 


1109 


2895 


4681


6467 


784CIP2B_798 


8020 


1110 


2896 


4682 


6468 


784CIP2B_799


8022 


1111 


2897 


4683 


6469 


784CIP2B_800


8022 


1112 


2898 


4684 


6470


784CIP2B_801


8028 


1113 


2899


4685


6471 


784CIP2B 802 


8030


1114 


2900 


4686 


6472 


784CIP2B_803


8038 


1115 


2901 


4687


6473


784CIP2B_804


8042


1116 


2902 


4688 


6474 


784CIP2B 805 


804^ 


1117 


2903 


4689 


6475 


784CIP2B_806


8045 


1118 


2904 


4690 


6476 


784CIP2B_807


8046 


1119 


2905 


4691 


6477


784CIP2B_808


8047


1120 


2906 


4692 


6478 


784CIP2B_809


8051


1121 


2907 


4693 


6479


784CIP2B_810


8059 


1122 


2908 


4694 


6480


784CIP2B_811


8064 


1123 


2909 


4695 


6481 


784CIP2B 812 


8069


1124 


2910 


4696 


6482 


784CIP2B 813 


8074 


1125 


2911 


4697 


6483 


784CIP2B_814


8077


1126 


2912 


4698


6484


784CIP2B_815


8078 


1127 


2913 


4699 


6485 


784CIP2B 816 


8079 


1128


2914 


4700 


6486


784CIP2B 817 


8084 


1129 


2915 


4701 


6487 


784CIP2B 818 


8083 


1130 


2916 


4702 


6488 


784CIP2B_819


8090 


1131 


2917 


4703


6489


784CIP2B_820


8091 


1132 


2918 


4704 


6490 


784CIP2B_821


onQQ 

OU?7 


1133 


2919 


4705 


6491


784CIP2B_822


ariqo 


1134 


2920 


4706 


6492 


784CIP2B_823


8100 


1135 


2921 


4707 


6493 


784CIP2B 824 




1136 


2922 


4708 


6494 


784CIP2B 825 


PI At 


1137 


2923 


4709


6495 


784CIP2B 826 


8103 


1138 


2924 


4710 


6496 


784CIP2B 827 


8104 


1139 


2925 


4711 


6497 


784CIP2B 828 


8108 


1140 


2926 


4712 


6498 


784CIP2B_829


8110 


1141 


2927 


4713 


6499 


784CIP2B 830 


OX 10 


1142 


2928


4714 


6500 


784CIP2B 831 


8117 


1143 


2929 


4715 


6501 


784CIP2B_832


8123 


1144 


2930 


4716 


6502 


784CIP2B_833


8130


1145 


2931 


4717


6503 


784CIP2B_834


8130 


1146 


2932 


4718 


6504 


784CIP2B_835


arai 


1147 


2933 


4719 


6505


784CIP2B_836


Din 


1148 


2934 


4720 


6506 


784CIP2B_837


8154


1149 


2935 


4721 


6507


784CIP2B_838


8155 


1150 


2936 


4722


6508


784CIP2B_839


8162


1151 


2937


4723


6509


784CIP2B_840




&l£3 


1152 


2938 


4724


6510


784CIP2B_841


8172


1153 


2939 


4725 


6511


784CIP2B_842


8173


1154 


2940 


4726 


6512


784CIP2B_843


8179


1155 


2941


4727 


6513 


784CIP2B 844 


8182 


1156 


2942 


4728 


6514 


784CIP2B 845 


8183 


1157 


2943 


4729 


6515 


784CIP2B 846 


8184 


1158 


2944 


4730 


6516 


784CIP2B 847 


8185 


1159 


2945 


4731 


6517 


784CIP2B 848 


8187 


1160 


2946


4732 


6518 


784CIP2B 849 


8188 


1161 


2947 


4733 


6519 


784CIP2B 850 


8190 


1162 


2948


4734 


6520 


784CIP2B_851


8190 


1163 


2949


4735


6521


784CIP2B_852


8192 


1164 


2950 


4736 


6522 


784CIP2B 853 


8193 


1165


2951


4737 


6523 


784CIP2B 854 


8197 


1166 


2952 


4738 


6524 


784CIP2B_855


8197 


1167


2953


4739 


6525 


784CIP2B 856 


8199 


1168 


2954 


4740


6526


784CIP2B_857


8202 


1169


2955


4741 


6527 


784CIP2B_858 


8203 


1170 


2956 


4742 


6528 


784CIP2B_859


8208 


1171 


2957 


4743 


6529


784CIP2B 860 


8209 


1172 


2958


4744 


6530 


784CIP2B_861 


8211 


1173 


2959


4745 


6531 


784CIP2B_862


8214 


1174 


2960 


4746 


6532 


784CIP2B 863 


8217 


1175 


2961


4747


6533


784CIP2B_864


8223




1176 


2962


4748


6534


784CIP2B_865


8224 


1177


2963


4749


6535 


784CIP2B 866 


8226


1178


2964 


4750 


6536 


784CIP2B 867 


8227 


1179


2965 


4751 


6537 


784CIP2B_868 


8229 


1180 


2966 


4752 


6538 


784CIP2B_869 


8232 


1181 


2967 


4753 


6539 


784CIP2B_870 


8236


1182 


2968 


4754 


6540 


784CIP2B_871 


8239


1183 


2969 


4755


6541 


784CIP2B 872 


8244


1184 


2970 


4756 


6542 


784CIP2B_873 


8245 


1185 


2971 


4757 


6543 


784CIP2B_874 


8248


1186


2972 


4758 


6544 


784CIP2B 875 


8251 


1187 


2973 


4759 


6545 


784CIP2B 876 


8253 


1188 


2974 


4760 


6546 


784CIP2B_877 


8260 


1189 


2975 


4761 


6547 


784CIP2B 878 


8262 


1190 


2976 


4762 


6548 


784CIP2B_879


8268 


1191 


2977 


4763 


6549


784CIP2B_880


8270 


1192 


2978


4764 


6550 


784CIP2B 881 


8272 


1193 


2979 


4765 


6551 


784CIP2B_882 


8274 


1194 


2980 


4766 


6552 


784CIP2B 883 


8274 


1195 


2981 


4767 


6553 


784CIP2B_884


8275 


1196


2982 


4768 


6554 


784CIP2B_885 


8277 


1197 


2983 


4769 


6555 


784CIP2B 886 


8281


1198 


2984 


4770 


6556


784CIP2B_887


8283 


1199 


2985 


4771 


6557 


784CIP2B 888 


8289 


1200 


2986 


4772 


6558 


784CIP2B 889 


8295 


1201 


2987 


4773 


6559 


784CIP2B 890 


8300 


1202 


2988 


4774


6560


784CIP2B_891


8303 


1203 


2989 


4775 


6561 


784CIP2B_892 


8304 


1204 


2990 


4776 


6562 


784CIP2B_893 


8305 


1205


2991 


4777 


6563 


784CIP2B_894 


8309


1206 


2992 


4778 


6564 


784CIP2B_895 


8318 


1207 


2993 


4779 


6565 


784CIP2B 896 


8319 


1208 


2994 


4780 


6566 


784CIP2B_897 


8321 


1209 


2995 


4781 


6567 


784CIP2B 898 


8322 


1210 


2996 


4782 


6568 


784CIP2B_899


8323 


1211


2997 


4783 


6569


784CIP2B_900


8325


1212 


2998 


4784 


6570 


784CIP2B_901


8331 


1213 


2999 


4785


6571 


784CIP2B_902 


8332


1214 


3000 


4786


6572 


784CIP2B 903 


8333 


1215 


3001 


4787 


6573 


784CIP2B_904 


B33<J 


1216 


3002 


4788 


6574 


784CIP2B 905 


8336 


1217 


3003 


4789 


6575 


784CIP2B_906


8337 


1218 


3004 


4790 


6576 


784CIP2B 907 


8340 


1219 


3005 


4791 


6577 


784CIP2B 908 


8343 


1220 


3006 


4792 


6578 


784CIP2B_909


8347 


1221 


3007 


4793 


6579


784CIP2B 910 


8349 


1222 


3008 


4794 


6580 


784CIP2B_911


8351 


1223 


3009 


4795 


6581


784CIP2B 912 


8353 


1224 


3010 


4796 


6582 


784CIP2B 913 


8355


1225


3011


4797 


6583 


784CIP2B 914 


8361


1226


3012


4798 


6584 


784CIP2B_915 


8365 


1227


3013 


4799 


6585 


784CIP2B_916 


8367 


1228 


3014 


4800 


6586 


784CIP2B_917 


8369 


1229 


3015


4801


6587


784CIP2B_919


8375 


1230 


3016 


4802 


6588 


784CIP2B_920


8387 


1231 


3017 


4803 


6589


784CIP2B_921 


8391 


1232 


3018 


4804 


6590 


784CIP2B 922 


8393


1233 


3019 


4805 


6591 


784CIP2B_923 


8393 


1234 


3020 


4806 


6592 


784CIP2B_924 


8394 


1235 


3021 


4807 


6593 


784CIP2B_925


8395


1236


3022 


4808 


6594 


784CIP2B_926 


8396 


1237 


3023


4809 


6595 


784CIP2B_927 


8398


1238


3024 


4810 


6596 


784CIP2B 928 


8402 


1239 


3025 


4811 


6597 


784CIP2B 929 


8402 


1240 


3026 


4812 


6598 


784CIP2B 930 


8405


1241 


3027


4813 


6599 


784CIP2B_931


R4 n£ 

D^l Ub 


1242 


3028 


4814 


6600 


784CIP2B_932


fl4 no 


1243 


3029 


4815


6601


784CIP2B_933


a a i ft 
OH xu 


1244 


3030 


4816


6602


784CIP2B_934


o4 X4 . 


1245


3031


4817


6603


784CIP2B_935


OH X3 


1246 


3032 


4818 


6604 


784CIP2B_936


8419 


1247 


3033 


4819


6605


784CIP2B_937


8426 


1248


3034


4820


6606


784CIP2B_938


8430 


1249 


3035 


4821


6607


784CIP2B_939


8431 


1250 


3036


4822


6608


784CIP2B_940


8432 


1251 


3037 


4823 


6609


784CIP2B_941


8433 


1252 


3038


4824


6610


784CIP2B_942


8434 


1253 


3039


4825


6611


784CIP2B_943


8438 


1254 


3040 


4826 


6612


784CIP2B 944 


8439 


1255 


3041 


4827


6613


784CIP2B 945 


8441 


1256 


3042 


4828


6614


784CIP2B_946


8450 


1257 


3043 


4829


6615


784CIP2B_947


8451 


1258 


3044 


4830


6616


784CIP2B_948


8452 


1259 


3045 


4831 


6617 


784CIP2B_949 


8460 


1260 


3046 


4832 


6618


784CIP2B_950


8461 


1261 


3047


4833


6619


784CIP2B 951 


8462 


1262


3048


4834


6620


784CIP2B_952


8464


1263 


3049 


4835


6621


784CIP2B_953 


8465 


1264 


3050 


4836


6622


784CIP2B_954


8467 


1265 


3051 


4837


6623


784CIP2B_955


8470 


1266 


3052 


4838


6624


784CIP2B_956


8471 


1267 


3053 


4839


6625


784CIP2B_957


8473 


1268 


3054


4840


6626


784CIP2B 958 


8474 


1269 


3055 


4841


6627


784CIP2B_959 


8475 


1270 


3056 


4842


6628


784CIP2B_960


8476 


1271 


3057 


4843


6629


784CIP2B 961 


8480 


1272 


3058 


4844 


6630


784CIP2B 962 


8482 


1273 


3059 


4845


6631


784CIP2B 963 


8482 


1274 


3060 


4846


6632


784CIP2B 964 


8486 


1275


3061 


4847


6633


784CIP2B_965


8488 


1276 


3062 


4848


6634


784CIP2B_966


8492


1277


3063


4849


6635


784CIP2B_967


8494 


1278 


3064 


4850


6636


784CIP2B_968


8496 


1279 


3065


4851


6637


784CIP2B_969


8497 


1280


3066


4852


6638


784CIP2B_970


8499 


1281 


3067 


4853 


6639


784CIP2B_971






1282 


3068


4854


6640


784CIP2B_972


8522 


1283 


3069 


4855


6641


784CIP2B_973


8526 


1284 


3070 


4856


6642


784CIP2B_974


flCH *" 
OS JJ. 


1285 


3071 


4857 


6643 


784CIP2B_975




1286 


3072 


4858 


6644 


784CIP2B_976




1287 


3073 


4859 


6645 


784CIP2B_977




1288 


3074 


4860 


6646 


784CIP2B_978




1289


3075 


4861 


6647


784CIP2B_979






1290 


3076 


4862 


6648 


784CIP2B_980




1291


3077


4863


6649 


784CIP2B 981 


8576 


1292 


3078 


4864 


6650 


784CIP2B 982 


8578 


1293 


3079 


4865 


6651 


784CIP2B 983 


8584 


1294 


3080 


4866


6652


784CIP2B_984


B^9B 


1295 


3081 


4867 


6653 


784CIP2B 985 


8602


1296 


3082 


4868 


6654 


784CIP2B_986 


8604 


1297 


3083 


4869


6655 


784CIP2B 987 


8609 


1298


3084 


4870 


6656 


784CIP2B 988 


8612 


1299 


3085 


4871 


6657


784CIP2B 989 


8637


1300 


3086 


4872 


6658 


784CIP2B_990


8640 


1301 


3087 


4873 


6659 


784CIP2B_991


8643 


1302 


3088 


4874


6660 


784CIP2B 992 


864£> 


1303


3089 


4875 


6661


784CIP2B 993 


8650 


1304 


3090 


4876 


6662 


784CIP2B 994 


8651 


1305 


3091 


4877 


6663 


784CIP2B 995 




1306 


3092 


4878


6664 


784CIP2B 996 


8655 


1307 


3093 


4879 


6665


784CIP2B 997 


ODD J 


1308


3094 


4880 


6666 


784CIP2B_998


DDD3 


1309 


3095 


4881 


6667 


784CIP2B_999


DCCD 

dob a 


1310 


3096 


4882 


6668


784CIP2B_1000


RC-M 

do /l 


1311 


3097 


4883


6669


784CIP2B_1001


8672 


1312 


3098 


4884 


6670


784CIP2B_1002


8692


1313 


3099 


4885 


6671


784CIP2B_1003


8706


1314 


3100 


4886


6672


784CIP2B_1004


8716 


1315 


3101 


4887 


6673


784CIP2B_1005


8719 


1316 


3102 


4888 


6674


784CIP2B_1006


8743


1317 


3103 


4889 


6675


784CIP2B_1007


8764 


1318 


3104 


4890 


6676


784CIP2B_1008


8764 


1319 


3105 


4891


6677


784CIP2B_1009


8764


1320 


3106 


4892


6678


784CIP2B_1010


8774


1321 


3107 


4893 


6679


784CIP2B_1011


8782 


1322 


3108 


4894


6680 


784CIP2B 1012 


8796 


1323 


3109 


4895 


6681


784CIP2B_1013


8827


1324 


3110 


4896 


6682


784CIP2B_1014


8842 


1325


3111 


4897 


6683


784CIP2B_1015


8842 


1326 


3112 


4898 


6684 


784CIP2B_1016


8858 


1327 


3113 


4899 


6685


784CIP2B_1017


8871 


1328 


3114 


4900 


6686


784CIP2B_1018


8921 


1329


3115


4901


6687


784CIP2B_1019


8927 


1330 


3116 


4902


6688


784CIP2B 1020 


8942 


1331 


3117 


4903


6689


784CIP2B_1021


8994 


1332 


3118 


4904 


6690


784CIP2B_1022


9023 


1333 


3119 


4905 


6691


784CIP2B_1023


9028 


1334 


3120 


4906


6692


784CIP2B_1024


9058 


1335


3121


4907


6693


784CIP2B_1025


9058 


1336 


3122 


4908 


6694 


784CIP2B_1026


9079 


1337


3123


4909 


6695 


784CIP2B_1027


9079


1338


3124


4910 


6696 


784CIP2B_1028


9082


1339


3125 


4911 


6697 


784CIP2B_1029


9084 


1340 


3126


4912


6698


784CIP2B_1030


9093 


1341 


3127 


4913


6699


784CIP2B_1031


9101


1342 


3128


4914


6700


784CIP2B_1032


9103


1343


3129


4915


6701


784CIP2B_1033


9105


1344 


3130 


4916 


6702 


784CIP2B_1034


Q1C1 " ' 

7X3 JL 


1345 


3131 


4917 


6703 


784CIP2B_1035


7 XOX 


1346 


3132 


4918 


6704 


784CIP2B_1036


an Ti 


1347 


3133 


4919 


6705 


784CIP2B_1037


7X /** 


1348 


3134 


4920 


6706 


784CIP2B_1038


9204


1349 


3135 


4921


6707 


784CIP2B 1039 


9234 


1350 


3136 


4922 


6708 


784CIP2B_1040




1351


3137


4923 


6709 


784CIP2B 1041 


7*J7 


1352 


3138 


4924 


6710 


784CIP2B_1042


9< DO 


1353 


3139 


4925 


6711 


784CIP2B_1043


9276 


1354 


3140 


4926 


6712 


784CIP2B 1044 


9345 


1355 


3141


4927 


6713 


784CIP2B_1045


9379 


1356 


3142 


4928 


6714 


784CIP2B 1046 


9435 


1357


3143


4929 


6715 


784CIP2B_1047


9437 


1358 


3144 


4930 


6716 


784CIP2B_1048


9469 


1359 


3145 


4931 


6717 


784CIP2B 1049 


9500 


1360 


3146 


4932 


6718 


784CIP2B 1050 


9502 


1361 


3147 


4933 


6719 


784CIP2B 1051 


9520


1362 


3148 


4934 


6720 


784CIP2B 1052 


9541 


1363 


3149


4935 


6721 


784CIP2B_1053 


9541 


1364 


3150 


4936 


6722 


784CIP2B_1054


9548 


1365 


3151 


4937 


6723 


784CIP2B 1055 


9556


1366


3152


4938 


6724 


784CIP2B 1056 


9556 


1367 


3153 


4939 


6725 


784CIP2B 1057 


9575 


1368 


3154 


4940 


6726 


784CIP2B 1058 


9589 


1369 


3155 


4941 


6727 


784CIP2B_1059


9599 


1370 


3156 


4942 


6728 


784CIP2B_1060


9602 


1371 


3157 


4943 


6729 


784CIP2B 1061 


7DUO 


1372 


3158 


4944 


6730 


784CIP2B_1062


QC*J7 


1373 


3159 


4945 


6731 


784CIP2B_1063




1374 


3160


4946


6732


784CIP2B_1064


or AC 


1375 


3161 


4947 


6733 


784CIP2B_1065


3/47 


1376 


3162


4948 


6734 


784CIP2B_1066


3 773 


1377 


3163 


4949 


6735 


784CIP2B_1067


QTQC 

9 / tab 


1378 


3164 


4950 


6736


784CIP2B_1068


9801 


1379 


3165 


4951 


6737


784CIP2B_1069


9811


1380


3166 


4952 


6738


784CIP2B_1070


9843


1381 


3167 


4953


6739


784CIP2B_1071


9854 


1382 


3168 


4954 


6740


784CIP2B_1072


9854


1383


3169


4955


6741


784CIP2B_1073


9864


1384 


3170 


4956 


6742 


784CIP2B_1074


9864 


1385 


3171 


4957 


6743 


784CIP2B_1075


9871 


1386 


3172


4958


6744


784CIP2B_1076


9879 


1387 


3173 


4959 


6745 


784CIP2B_1077


9881 


1388 


3174 


4960 


6746 


784CIP2B_1078


9885 


1389 


3175 


4961 


6747 


784CIP2B_1079


9901 


1390 


3176 


4962 


6748 


784CIP2B_1080


9912 


1391 


3177 


4963


6749


784CIP2B_1081


9916


1392 


3178 


4964 


6750


784CIP2B_1082


9921


1393 


3179 


4965 


6751 


784CIP2B_1083


9925 


1394 


3180 


4966 


6752 


784CIP2B_1084


9930


1395 


3181 


4967 


6753 




9949 


1396 


3182 


4968 


6754




9951 


1397 


3183 


4969


6755 


784CIP2B_1087


9959 


1398 


3184 


4970 


6756 




9973 


1399 


3185 


4971 


6757 


784CIP2B_1089


9962 


1400 


3186 


4972 


6758 


"7ttAPTt>*3Ta T AQA 
/OILlf ZD 1U?U 


999*1 


1401 




4973 


6759 




10021 


1402 


3188 


4974 


6760


784CIP7H 1097 




1403


3189 


4975 


6761




1 PmC7 


1404 


3190 


4976 


6762 


784CIP2B_1095




1405 


3191 


4977 


6763 


784CIP2B_1096




1406 


3192 


4978 


6764 


784CIP2B 1097 


i m i 7 


1407 


3193 


4979 


6765 


784CIP2B 1098 


10132 


1408 


3194 


4980 


6766 


784CIP2B 1099 


xuxo3 


1409 


3195


4981 


6767 


784CIP2B 1100 


10217 


1410 


3196 


4982 


6768 


784CIP2B 1101 


10226 


1411 


3197 


4983 


6769 


784CIP2B 1102 


10232 


1412 


3198 


4984 


6770 


784CIP2B 1103 


10237 


1413 


3199 


4985 


6771 


784CIP2B 1104 


10279 


1414 


3200 


4986


6772 


784CIP2C 1 


00 


1415 


3201 


4987 


6773 


784CIP2C 2 


271 


1416 


3202 


4988 


6774 


784CIP2C 3 


848 


1417 


3203 


4989 


6775 


784CIP2C 4 


849 


1418 


3204 


4990 


6776 


784CIP2C_5 


864 


1419 


3205


4991 


6777 


784CIP2C_6 


953 


1420 


3206 


4992 


6778 


784CIP2C 7 


980 


1421 


3207 


4993 


6779 


784CIP2C 8 


1595 


1422 


3208 


4994 


6780 


784CIP2C_9 


1697 


1423 


3209 


4995 


6781 


784CIP2C_10 


1744 


" 1424 


3210 


4996


6782 


784CIP2C 11 


1937 


1425 


3211 


4997 


6783 


784CIP2C 12 


1955 


1426 


3212 


4998


6784 


784CIP2C 13 


1955 


1427 


3213 


4999 


6785 


784CIP2C 14 


2185 


1428 


3214 


5000 


6786 


784CIP2C_15


2889 


1429 


3215 


5001 


6787 


784CIP2C 16 


2901 


1430 


3216 


5002 


6788


784CIP2C_17 


2902 


1431 


3217 


5003 


6789 


784CIP2C_18


290$ 


1432 


3218 


5004 


6790 


784CIP2C 19 


2948 


1433 


3219 


5005 


6791 


784CIP2C 20 


2956 


1434 


3220 


5006 


6792 


784CIP2C 21 


2959 


1435 


3221 


5007 


6793 


784CIP2C_22 


294$ 


1436 


3222 


5008 


6794 


784CIP2C 23 


2966 


1437 


3223 


5009 


6795 


784CIP2C 24 


2970 


1438 


3224 


5010 


6796 


784CIP2C 25 


2985 


1439 


3225 


5011 


6797 


784CIP2C 26 


29&7 


1440 


3226 


5012 


6798 


784CIP2C 27 


2993 


1441 


3227 


5013 


6799 


784CIP2C_28


2993 


1442 


3228 


5014 


6800 


784CIP2C 29 


3017 


1443


3229 


5015 


6801 


784CIP2C 30 


3046 


1444 


3230 


5016 


6802 


784CIP2C 31 


3050 


1445 


3231 


5017 


6803 


784CIP2C 32 


3357 


1446 


3232 


5018


6804 


784CIP2C 33 


3359 


1447 


3233 


5019 


6805 


784CIP2C 34 


3432 


1448


3234 


5020 


6806 


784CIP2C 35 


3438 


1449 


3235 


5021 


6807 


784CIP2C 36 


3439 


1450 


3236 


5022 


6808


784CIP2C 39 


3463


1451 


3237 


5023 


6809


784CIP2C 40 


3466 


1452 


3238 


5024 


6810


784CIP2C 41 


3466 


1453 


3239 


5025 


6811


784CIP2C 42 


3467 


1454 


3240 


5026 


6812 


784CIP2C 43 


3468 


1455 


3241 


5027 


6813 


784CIP2C 44 


3483 


1456 


3242 


5028 


6814


784CIP2C 45 


3484 


1457 


3243 


5029 


6815 


784CIP2C_46


3488 


1458 


3244 


5030 


6816 


784CIP2C 47 


3491 


1459


3245 


5031 


6817 


784CIP2C 48 


3493 


1460 


3246 


5032 


6818 


784CIP2C 49 


3494 


1461 


3247 


5033 


6819


784CIP2C_50


3495 


1462 


3248


5034 


6820 


784CIP2C 51 


3496 


1463 


3249 


5035 


6821 


784CIP2C_52


3503 


1464 


3250 


5036 


6822


784CIP2C 53 


3503 


1465 


3251 


5037 


6823 


784CIP2C_54


3504


1466 


3252 


5038 


6824


784CIP2C 55 


3511


1467 


3253 


5039 


6825 


784CIP2C_56


3531 


1468 


3254


5040 


6826 


784CIP2C_57


3536 


1469 


3255 


5041 


6827 


784CIP2C_58


354$ 


1470 


3256 


5042 


6828 


784CIP2C 59 


3548 


1471 


3257 


5043 


6829 


784CIP2C 60 


3551 


1472 


3258 


5044 


6830 


784CIP2C_61


" 3553 


1473 


3259 


5045 


6831 


784CIP2C 62 


3564 


1474 


3260


5046 


6832 


784CIP2C 63 


3567 


1475 


3261 


5047 


6833 


784CIP2C_64


3572 


1476 


3262 


5048 


6834 


784CIP2C 65 


3573 


1477 


3263 


5049 


6835 


784CIP2C_66


3574


1478 


3264 


5050 


6836 


784CIP2C 67 


3583 


1479 


3265 


5051 


6837 


784CIP2C_68 


3615 


1480 


3266 


5052 


6838 


784CIP2C 69 


3623 


1481 


3267 


5053 


6839 


784CIP2C_70


3629 


1482 


3268 


5054 


6840 


784CIP2C 71 


3666 


1483 


3269 


5055 


6841


784CIP2C 72 


3667 


1484 


3270 


5056 


6842


784CIP2C 73 


3906 


1485 


3271 


5057 


6843 


784CIP2C_74


3912 


1486 


3272 


5058 


6844 


784CIP2C 75 


3924 


1487 


3273 


5059 


6845


784CIP2C 76 


3928 


1488 


3274 


5060


6846 


784CIP2C 77 


3935 


1489 


3275 


5061 


6847 


784CIP2C_78 


3559 


1490 


3276 


5062 


6848 


784CIP2C_79


3981 


1491 


3277 


5063


6849 


784CIP2C 80 


3989 


1492 


3278 


5064 


6850 


784CIP2C 81 


4295 


1493 


3279 


5065 


6851 


784CIP2C 82 


4300 


1494 


3280 


5066 


6852 


784CIP2C 83 


4360 


1495 


3281 


5067 


6853 


784CIP2C 84 


4362 


1496 


3282 


5068 


" "6854 


784CIP2C_85


4371 


1497


3283 


5069 


6855 


784CIP2C_86


4373 


1498 


3284 


5070 


6856 


784CIP2C_87


d^7fi 
*j /© 


1499 


3285 


5071 


6857 


7R4CTP2P HQ 


4 3^8 


1500 


3286 


5072 


6858 




4382


1501 


3287 


5073 


6859 


7flAr , TP9f R1 


4409 


1502


3288


5074 


6860 




4421 


1503 


3289


5075 


6861 


7R4CIP9C! 91 


4421 


" "15^4 


3290 


5076 


6862 


7R4PTP3C! QA 


a a o c 

rt H £> D 


1505 


3291 


5077 


6863 






1506 


3292 


5078 


6864 


7flAf , TP7r' Qfi 




1507 


3293 


5079 


6865 


TflinDir on 

/ CiLlrZL^J / 


A A T £ 


1508 


3294


5080


6866




A A *) Q 


1509 


3295 


5081 


6867






1510 


3296 


5082 


6868 




A A A 1 


1511 


3297 


5083


6869 


tr AfT D^r* 




1512 


3298 


5084 


6870 




AA C K 


1513 


3299 


5085 


6871




A AC 3 


1514 


3300 


5086 






4466


1515 


3301 


5087 


6873 




AAA Q 
'i ft D J7 


1516 


3302 


5088 


6874


784CIP2C_106


447") 


1517 


3303 


5089 


6875


784CIP2C_107


A A OI 
440X 


1518 


3304 


5090 


6876


784CIP2C_108


4483 


1519 


3305 


5091 


6877 




4484 


1520 


3306 


5092 


6878


784CIP2C_110


A ARC 


1521


3307 


" 5093" 


6879


784CIP2C_111


4490


1522 


3308 


5094 


6880


784CIP2C_112




1523


3309 


5095 


6881 


784CIP2C_113




1524 


3310 


5096 


6882 


784CIP2C_114




1525 


3311 


5097 


6883


784CIP2C_115


4509 


1526


3312 


5098 


6884 


784CIP2C_116


4514


1527 


3313 


5099 


6885 


784CIP2C_117


Am 

13XD 


1528 


3314 


5100 


6886 


784CIP2C 118 


4522 


1529 


3315 


5101 


"'£887 


784CIP2C 119 


4525


1530 


3316 


5102 


6888 


784CIP2C_120


4527 


1531 


3317 


5103 


6889


784CIP2C_121


4528 


1532 


3318 


5104 


6890 


784CIP2C_122


4529 


1533 


3319 


5105


6891 


784CIP2C 123 


4532 


1534 


3320 


5106 


6892 


784CIP2C 124 


4537 


1535 


3321 


5107 


6893 


784CIP2C_125


4538 


1536 


3322 


5108 


6894 


784CIP2C 126 


4551 


1537 


3323 


5109 


6895 


784CIP2C_127


4552 


1538 


3324 


5110 


6896 


784CIP2C 128 


4559 


1539 


3325 


5111 


6897 


784CIP2C_129 


4567 


1540 


3326 


5112 


6898 


784CIP2C_130


4568 


1541 


3327 


5113 


6899 


784CIP2C_132 


4585 


1542


3328 


5114 


6900 


784CIP2C_133 


4592 


1543 


3329 


5115 


6901 


784CIP2C_134 


4609 


1544 


3330 


5116 


6902 


784CIP2C_135


4616 


1545 


3331 


5117 


6903 


784CIP2C_136 


4617 


1546 


3332 


5118 


6904 


784CIP2C_137


4618 


1547 


3333


5119


6905 


784CIP2C 138 


4620 


1548 


3334 


5120 


6906 


784CIP2C_139


4624 


1549 


3335 


5121 


6907


784CIP2C_140


"46T2 


1550 


3336


5122 


6908 




4634 


1551 


3337 


5123 


6909 


784CIP2C 142 


4638 


1552 


3338 


5124 


6910 


784CIP2C 143 


4639 


1553 


3339 


5125 


6911 


784CIP2C_144


4643 


1554


3340


5126


6912 


784CIP2C 145 


4644 


1555 


3341 


5127 


6913 


784CIP2C_146


4£cc 


1556 


3342 


5128 


"6914 


784CIP2C_147


400 0 


1557 


3343 


5129 


6915 




HO i / 


1558 


3344 


5130 


6916 






1559 


3345 


5131




784CIP2C_150


ach 
4o77 


1560 


3346 






784CIP2C_152


4682 


1561 


3347 


5133 




784CIP2C_153


a can 


1562 


3348 


5134 


6920


784CIP2C_154


4691 


1563 


3349 






784CIP2C_155


4727 


1564 


3350 


5136 




784CIP2C_156


4730 


1565 


3351 


5137


6923


784CIP2C_157


4734 


1566




5138


6924 


784CIP2C_158


4757 


1567 




5139 


6925 


784CIP2C_159 


4764 


"~1568 


3354


5140 


6926 


784CIP2C_160


4786 


1569


3355


5141 


6927 


784CIP2C_161


4793 


1570 


3356


5142 


6928 


784CIP2C 162 


4825 


1571 


3357


5143 


6929 


784CIP2C 163 


4826 


1572 


3358 


5144


6930 


784CIP2C_164


4850 


1573 




5145 


6931 


784CIP2C_165 


4853 


1574 


3360 


5146 


6932


784CIP2C_166


4855 


1575


3361 


5147


6933 


784CIP2C_167


4856 


1576


3362


5148 


6934 


784CIP2C 168 


4867 


1577 




5149 


6935 


784CIP2C 169 


4869 


1578 




5150 


6936


784CIP2C_170


4878 


1579 


3365 


5151


6937 


784CIP2C_171


4880 


1580 


3366


5152


6938


784CIP2C_172


4942 


1581 


3367


5153 


6939 


784CIP2C 173 


4945 


1582 


3368 


5154


6940 


784CIP2C_174


4950 


1583 


3369


5155


6941


784CIP2C_175


4952 


1584 


3370


5156


6942 


784CIP2C_176


4954 


1585 


3371 


5157






4958 


1586 


3372 


5158




784CIP2C_178


4961 


1587 


3373 


5159 


6945


784CIP2C_179


5590 


1588


3374


5160 


6946


784CIP2C_180


5599 


1589 


3375 


5161 


6947 


784CIP2C_181


5692 


1590 


3376 


5162 




784CIP2C_182


5732 


1591 


3377


5163 


6949 


784CIP2C_183


CTCC 

3 / 09 


" 1592 


3378 


5164 


6950


784CIP2C_184


c*n i 


1593 


3379


5165 


6951 


784CIP2C_185




1594 


3380 


5166 


6952 


784CIP2C_186


C7Q-J 


1595 


3381


5167 


6953 


784CIP2C 187 


5606 


1596


3382 


5168


6954 


784CIP2C_188


999^ 


1597 


3383 


5169 


6955 


784CIP2C_189


5892 


1598 


3384


5170 


6956 


784CIP2C_190


6057 


1599 


3385 


5171 


6957 




6061


1600 


3386 


5172


6958 




01U9 


1601 


3387 


5173 


6959 


784CIP2C 193 


6160 


1602 


3388 


5174 


6960 


784CIP2C 194 


6297 


1603 


3389 


5175 


6961 


784CIP2C_195


6398 


1604 


3390 


5176 


6962 


784CIP2C_196


6398


1605 


3391 


5177 


6963


784CIP2C 197 


6415 


1606 


3392 


5178 


6964 


784CIP2C 198 


6448 


1607 


3393 


5179 


6965 


784CIP2C 199 


6469


1608 


3394 


5180 


6966 


784CIP2C_200


6474


1609 


3395 


5181 


6967


784CIP2C_201


6561 


1610 


3396 


" 5182 " 


6968


784CIP2C_202


ce-7* 


1611 


3397 


5183


6969 


784CIP2C_203


CC'l n 
OD / O 


1612 


3398 


5184 


6970 




Dbo2 


1613 


3399 


5185


6971




6672 


1614 


3400 


5186 




784CIP2C_206


bo3x 


1615 


3401 




6973


784CIP2C_207


ccqc 


1616


3402 


5188






6746 


1617 


3403






784CIP2C_209


6898


1618


3404


5190


6976 


784CIP2C_210


6938 


1619




5191 


6977 


784CIP2C_211


6943 




3406 


5192 


6978 


784CIP2C 212 


7110 




3407


5193 


6979 


784CIP2C 213 


7200 


1622


3408 


5194 


6980 


784CIP2C 214 


7212 


1623 


3409 


5195 


6981 


784CIP2C_215


7218 


1624 


3410 


5196 


6982 


784CIP2C_216


7249 


1625 


3411


5197 


6983 


784CIP2C 217 


7500 


1626 


3412 


5198 


6984 


784CIP2C_218 


7509 


1627 


3413 


5199 


6985 


784CIP2C_219 


7523 


1628


3414 


5200 


6986 


784CIP2C_220


7544 


1629 


3415 


5201 


6987 


784CIP2C_221 


7564 


1630 


3416 


5202 


6988 


784CIP2C_222 


7568 


1631 


3417 


5203 


6989 


784CIP2C_223


7631 


1632 


3418 


5204 


6990 


784CIP2C_224 


7813


1633 


3419 


5205 


6991 


784CIP2C_225


7831 


1634 


3420 


5206 


6992 


784CIP2C_226 


7843 


1635 


3421 


5207 


6993 


784CIP2C_227 


7907 


1636 


3422 


5208 


6994 


784CIP2C_228


7943 


1637 


3423 


5209 


6995 


784CIP2C 229 


8175


1638 


3424 


5210 


6996 


784CIP2C_230 


8216 


1639 


3425 


5211 


6997 


784CIP2C_231 


8225 


1640 


3426 


5212 


6998 


784CIP2C_232 


8271 


1641 


3427 


5213 


6999 


784CIP2C_233 


8397 


1642 


3428 


5214 


7000 


784CIP2C 234 


8466 


1643 


3429 


5215 


7001 


784CIP2C_235 


8503 


1644 


3430 


5216 


7002 


784CIP2C_236 


8953 


1645 


3431 


5217 


7003 


784CIP2C_237 


9106 


1646 


3432


5218 


7004 


784CIP2C 238 


9139 


1647 


3433 


5219 


7005 


784CIP2C 239 


9555 


1648 


3434 


5220 


7006 


784CIP2C_240 


9650 




3435 


5221 


7007 


784CIP2C_241


9889 






5222 


7008 


784CIP2C_242 


9933 


1651




3437 


5223 


7009 


784CIP2C_243


9953 


1652


3438


5224 


7010 


784CIP2C_244 


9981 


1653


3439


5225 


7011 


784CIP2D 1 


746 


1654


3440


5226


7012 


784CIP2D_2


3558 


"1655" 




5227


7013 


784CIP2D 3 


3553 


1656 


3442




7014 


784CIP2D_4


3633 


1657 






7015


784CIP2D_5


3658 


1658 


3444 


5230 


7016


784CIP2D_6


J / 32 


1659 




5231


7017


784CIP2D_7


4004 


1660 


3446 




7018 


784CIP2D_8


4700 


1661 


3447






784CIP2D_9


4703 


1662 




5234


7020 


784CIP2D 10 


4774 


1663 


3449 


5235 


7021 


784CIP2D_11


4894 


1664 


3450 


5236


7022 


784CIP2D_12 


4918 


1665 


3451 


5237 


7023 


784CIP2D_13 


5159 


1666 


3452 


5238


7024 


784CIP2D 14 


7443 


1667 


3453 


5239 


7025


784CIP2D 15 


8673 


1668 


3454 


5240 


7026 


784CIP2D_16 


8679 


1669 


3455 


5241 


7027


784CIP2D 17 


8727 


1670 


3456 


5242 


7028 


784CIP2D 18 


"8734 


1671 


3457 


5243 


7029 


784CIP2D_19 


8756 


1672 


3458


5244 


7030 


784CIP2D_20


8818 


1673 


3459 


5245 


7031 


784CIP2D 21 


8844


1674


3460 


5246 


7032 


784CIP2D_22 


8846 


1675 


3461 


5247 


7033 


784CIP2D_23 


8912 


1676 


3462 


5248 


7034 


784CIP2D 24 


8918 


1677 


3463 


5249


7035 


784CIP2D_25 


8918 


1678 


3464 


5250 


7036 


784CIP2D_26


8941 


1679


3465 


5251 


7037 


784CIP2D_27 


8941 


1680


3466


5252 


7038 


784CIP2D 28 


8951 


1681 


3467 


5253 


7039 


784CIP2D 29 


8951 


1682


3468 


5254 


7040 


784CIP2D_30


9007 


1683 


3469


5255 


7041


784CIP2D 31 


9012 


1684 


3470 


5256 


7042 


784CIP2D 32 


9013 


1685


3471 


5257 


7043 


784CIP2D 33 


9025 


1686


3472 


5258 


7044 


784CIP2D 34 


9053 


1687 


3473 


5259 


7045 


784CIP2D_35


9054


1688


3474 


5260 


7046


784CIP2D_36


9054 


1689 


3475 


5261 


7047 


784CIP2D_37


9113 


1690 


3476 


5262 


7048


784CIP2D 38 


9134 


1691 


3477 


5263 


7049 


784CIP2D 39 


9152 


1692


3478 


5264 


7050 


784CIP2D_40


9152 


1693 


3479 


5265 


7051


784CIP2D 41 


9211 


1694


3480 


5266 


7052 


784CIP2D 42 


9223 


1695 


3481


5267 


7053 


784CIP2D 43 


9223 


1696 


3482


5268 


7054 


784CIP2D 44 


9231 


1697 


3483 


5269 


7055 


784CIP2D_45


9236 


1698


3484 


5270 


7056 


784CIP2D 46 


9236


1699 


3485 


5271 


7057


784CIP2D 47 


9303 


1700 


3486 


5272


7058 


784CIP2D_48


9309 


1701 


3487


5273 


7059 


784CIP2D_49


9314 


1702 


3488 


5274 


7060 


784CIP2D 50 


9326 


1703 


3489 


5275 


7061 


784CIP2D 51 


9339 


1704 


3490 


5276 


7062


784CIP2D 52 


9348 


1705 


3491 


5277 


7063 


784CIP2D 53 


937* " 


1706


3492 


5278 


7064 


784CIP2D 54 


9382 


1707


3493 


5279


7065 


784CIP2D 55 


9407 


1708 


3494 


5280


7066 


784CIP2D_56


9414 


1709 


3495 


5281


7067 


784CIP2D_57


9439 


1710 


3496


5282 


7068 


784CIP2D_58


9485 


1711 


3497 


5283 


7069 


784CIP2D 59 


9493


1712 


3498 


5284 


7070


784CIP2D_60


9501


1713 


3499 


5285 


7071 


784CIP2D 61 


952"* 


1714 


3500 


5286 


7072 


784CIP2D 62 


9526 


1715 


3501 


5287 


7073 


784CIP2D_63 


9551 


1716 


3502 


5288


7074 


784CIP2D_64


" "9557 


1717 


3503 


5289 


7075 


784CIP2D_65


9568 


1718 


3504 


5290


7076 


784CIP2D 66 


9588 


1719 


3505 


5291 


7077 


784CIP2D_67


9597 


1720 


3506 


5292 


7078 


784CIP2D 68 


9615 


1721 


3507 


5293 


7079 


784CIP2D_69


9628 


1722 


3508 


5294 


7080 


784CIP2D 70 


9649 


1723 


3509 


5295 


7081 


784CIP2D_71 


9652 


1724 


3510 


5296 


7082 


784CIP2D_72


9660 


1725 


3511


5297 


7083 


784CIP2D_73 


9662 


1726 


3512 


5298


7084 


784CIP2D_74 


9725 


1727 


3513 


5299 


7085 


784CIP2D_75


9746 


1728


3514 


5300 


7086 


784CIP2D_76 


9777 


1729 


3515


5301 


7087


784CIP2D 77 


9787 


1730 


3516 


5302 


7088 


784CIP2D 78 


9790 


1731 


3517 


5303 


7089 


784CIP2D 79 


9842 


1732 


3518 


5304 


7090 


784CIP2D 80 


9842 


1733 


3519


5305 


7091 


784CIP2D 81 


9848 


1734 


3520 


5306 


7092 


784CIP2D_82 


9867 


1735 


3521 


5307 


7093 


784CIP2D_83


10010


1736


3522 


5308 


7094 


784CIP2D 84 


10011 


1737 


3523 


5309 


7095 


784CIP2D_85


10052 


1738 


3524 


5310 


7096


784CIP2D 86 


10057 


1739 


3525


5311 


7097 


784CIP2D 87 


10085 


1740 


3526 


5312 


7098 


784CIP2D 89 


10139 


1741 


3527 


5313 


7099 


784CIP2D_90 


1014T " 


1742 


3528


5314 


7100 


784CIP2D 92 


10165 


1743 


3529 


5315 


7101 


784CIP2D 93 


10173 


1744 


3530 


5316 


7102 


784CIP2D 94 


10173 


1745 


3531 


5317 


7103 


784CIP2D 95 


10273 


1746 


3532


5318 


7104 


784CIP2E 1 


3121 


1747 


3533 


5319 


7105 


784CIP2E 2 


3628 


1748 


3534


5320 


7106 


784CIP2E_4


JO rj 


1749 


3535 


5321 


7107 


784CIP2E_5


** uxu 


1750 


3536 


5322 


7108 


784CIP2E_6


4467 


1751


3537 


5323 


7109 


784CIP2E 7 


4865


1752 


3538 


5324 


7110 


784CIP2E 8 


*i 7X0 


1753 


3539 


5325 


7111


784CIP2E_9




1754 


3540 


5326 


7112 


784CIP2E_10


A Q*) £ 


1755 


3541 


5327 


7113 


784CIP2E_11 


4962 


1756


3542


5328 


7114 


784CIP2E_12 


4963 


1757 


3543 


5329 


7115 


784CIP2E_13 


4964


1758 


3544 


5330 


7116 


784CIP2E 14 


A O Q & 


1759 


3545 


5331 


7117


784CIP2E_15


CQ1C 


1760 


3546 


5332 


7118 


784CIP2E_16


/ O O Z 


1761 


3547 


5333 


7119 


784CIP2E 17 


/ b o Z 


1762 


3548 


5334 


7120 


784CIP2E_18 


/ b ;f 7 


1763 


3549 


5335 


7121 


784CIP2E 19 


t f\i 1 


1764 


3550 


5336


7122 


784CIP2E 20 


mr\n 
I /u / 


1765 


" 3 551 


5337 


7123 


784CIP2E_21


/ / — z 


1766 


3552 


5338


7124 


784CIP2E 22 




1767 


3553 


5339 


7125 


784CIP2E 23 




1768 


3554 


" " £340 


7126 


784CIP2E 24 


9324 


1769 


3555 


5341


7127 


784CIP2F_1


2976 


1770 


3556 


5342 


7128


784CIP2F 2 


1CCQ 


1771 


3557 


5343 


7129 


784CIP2F_3


4021 


1772 


3558


5344


7130 


784CIP2F 4 


AA*7A 


1773 


3559


5345 


7131 


784CIP2F 5 


4566 


1774 


3560 


5346


7132


784CIP2F 6 





1775 


3561 


5347 


7133 


784CIP2F_7


" 4707 


1776 


3562 


5348 


7134 


784CIP2F 8 


4712 


1777 


3563 


5349 


7135 


784CIP2F 9 


5008


1778 


3564 


5350 


7136 


784CIP2F_10 


5009 


1779 


3565 


5351 


7137 


784CIP2F 11 


5015


1780 


3566 


5352 


7138 


784CIP2F_12


5015 


1781 


3567 


5353 


7139 


784CIP2F 13 


7724 


1782 


3568


5354 


7140 


784CIP2F 14 


7725 


1783 


3569 


5355 


7141 


784CIP2F_15


8828 


1784 


3570 


5356 


7142 


784CIP2F_16


8830 


1785 


3571 


5357 


7143 


784CIP2F_17


9739 


1786 


3572 


5358 


7144 


784CIP2F_18


9896 
TABLE 7



SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


"53*9 


337 


1131 


AH USARhSAL I L D E VA I L P APQNLS VL»S TNMKHLLMW S PVIAPG 
ETVYYSVEYQGEYBSLYTSHIWIPSSWCSLTEGP3C3)VTDDITA 
TVPYNLRVRATLGSQTS/CLEHP/VSIPLIBTQPSLPDL/RMEI 
TKDGFHLVIELEDLGPQFEFLVAYWRREPGAEEHVKMVRSGGIP 
VHLETME PGAAY CVKAQTFVKA I G RYS AFS QT ECVEVQGE AI PL 
VLAL FAFVG PML I LVVVPLFVWKMGRLLQ/ YLLLPRGGSSQTPW 
KITQP 


5360 


2 


1115 


PRVRSSGGQBDPASQQWARPRPTQPSKMRRRVIARPVGSSVRLK 
CVASGHPRPDITWMKDDQALTRPEAAEPRKKKWTIiSLKNLRPED 
SGKYTCRVSNRAGAIKATYKVDVIQRTRSKPVIjTGTHPVNTTVD 
FGGTT S FQCKVRS DVKP V I QWLKRVE YGAEGRHNS TI D VUG Q KF 
WL PTGDVWSRPDGS YLN FCLLITRARQDDAGMYICLGANTMGYS 
FRSAFliTVLPDPX PPG PPVASSSSATS L PW P WI GI PAGAVFIL 
GTLLLWLCQAQ KKPCTPAP APPLPGHR? PGTARDRSGDKDLP SL 
AALSAGPGVGLCEEHGSPAAPQHLIiGPGPVAGPKLYPKLYTGHS 
TPHTYTHPPPSCQLNSSHS 


5361 


3 


925 


HEGS I S SAN ILLDDQFQPKI/TDFAMAHFRSHIiEHQ SCTINMT S S 
SSKKLWYMPEBYIRQGKLSIKTDVYSFGIVIMEVLTGCRVVIjDD 
PKHIQLRDLLRELMEKRGLDSCLSFLDKKVPPCPRNFSAKLFCL 

agrcaatraklrpsmdevlntlbstqaslyfaedpptslks PRC 

PSPLFLENVPSIPVEDDBSONNNLLPSDEGLRIDRMTQKTPFEC 
SQSEVMFXiSLDKKPESKRNEEACNMPSSSCEESWFPKYIVPSQD 
LRPYKVNIDPSSEAPGHSCRSRPVESSCSSKFSWDEYEQYKKE 


5362


2 


4879 


SCQVEGCTRT YNSS QS IG KHMKTAH PDQ YAAFKMQRKS KKGQKA 
NNLKTPNNGKFVYFLPSPVNSSNPFFTSQTKANGNPACSAQLQH 
VSPPIFPAHLASVSTPLLSSMESVINPNITSQDKN3QGGMLCSQ 
MENLPSTALPAQMEDLTKTVLPLNIDRGSDPFLS LPAESS S IDL 
FPSPADSGTNSVFSQLENNTNHYSSQIEGNTNSSFLKGGNGENA 
VFPSQVNVANKFSSTNAQQSAPEKVKKDRGRGQTGKERKPKHKTK 
RAKWPAI IRDGKFI CSRCYRAFTNPRSLGGHLS KRS YCKPLDGA 
EIAQBLLQSNGQPSI*LASMILSTNAVNLQQPC»S?FNPEACFKD 
PSFIiQLLAENRSPAFLPNTFPRSGVTNFNTSVSQEGSEII IQAIj 
ETAG I P STFEGABMLSHVSTGCVSDAS QVNATVMPNPTVPPLLH 
TVCHPNTLLTNQNRTSNS KTSS IEECSSLPVFPTND&LLKTVEN 
GLCSSSFPNSGGPSQ2S7F^SNSSRVSVISGP0NTRSSHLNKKGNS 
AS KRRKKVAPPLIAPNASQNLVTSDLTTMGLIAKSVEIPTTNLH 
SNVI PTCEPQS LVEI^TQKIJWVNNQLFMTDVKENFKTSLESHT 
VLAPLTLKTENGDSQMMAIiNSCTTSVNSDriQISBDNVIQNFEKT 
LE I IKTAMNSQ I LEVKSGS QGAGETSQNAQINYNIQLPSVNT VQ 
NNKLPDSSP\FS3FISVMPTESNIPQSE\VSHKBDQIQEILEGL 
QKLKLENDLSTPA5QCVLINTSVTLTPTPVKSTADITVIQPVS2 
MINIQFNDKVNKPFVCQNQGCNYSAMTjEaJALFKHYGKIHQYTPE 
M I L E I KKNQLKFAP F KC W PTCTKTFTRNS NIiRAH CQ L VHH FTT 
EEMVKLKIKRPYGRKSQSENVPASRSTQVKKQLAMTEENKKESQ 
PALEXRAETQNTHSNVAVI PEKQLI EKKS PDKTES SLQ VITVTS 
EQCNTNALTNTQTKGRKZRRHKKBKEEKKRKKPVSQSLEFPTRY 
SPYRPYRCVHQGCFAAFTIQQNLILHYQAVHKSDLPAFSAEVEE 
ESEAGKE S EETETKQTLKEFRCQVSDCSR I FQAI TGL IQHYMKL 
HEMTPEEIESMTA5VT»VGKFPCDQLEC!CSSFTTYLNYV^LEAD 
KGIGIJIASKTEEI)G VYKQ3 CEGCDRI YATR5NLLRKI FNKHNDK 
HKAHLIRPRRLTPGQENMSSKANQEKSKSKHRGTKHSRCGKEGI 
KMPKTKRKKKNNLEIJKNAKIVQIEEN^YSLKRGKHVYSIKARN 
DALSECTSRFVTQYPCMIKGCTSWTSESNIIRHYKCHKLSKAF 
TSQHRNLL I VFKR CCNSQVKETS EQEGAKND VKDSDTCVSESND 
NSRTTATVSQ KE VBKNE* DEMDELTED FITKLINEDSTS VETQA 
NTS SNVSNDFQEDNL CQS ER Q KAS ML KRVNXEKNVS QNXKRKVE 
KAE PASAAELSSVRKEEETAVAIQTI EBHPAS FDWSSFKPMGFE 
VS FLKFLEESAVKQKKNTDKDH PNTGNKKGSHSNSRKN1DKTAV 
TSGNHVCPCKESETFVQFANPSQLQCSDNVKIVLDKNLKDCTEL 
VLKQIiQEMKPTVSLKKLBVHSNDPDMSVMXDISIGKATGRGQV 


5363 


8066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCI PSVPP PVPFPTLWP 
PPSWRRQPPGGIRRDFSRRIjimEANLVATCLPVRASIiPHRLNML 
ROPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS 
Q S K PG CYDNG KHYQ I NQ QWERTY LGNAL VCTCYGG S RG FNCE S K 
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGR I S 
CTIANRCHEGGQSYKIGDTWRRPHETtSQYMLECVCLGNGKGBWT 
CKPIAEKCFDHAAGTSYVVGETWEKPYQGWMMVDCTCIiGEGSGR 
ITCTSRNRCNDQDTRTSYRIGDTWSKKDNRGNLLQCICTGNGRG 
EWKCERHTS VQTTS SG SG PFTDVRAAVYQPQPH PQ P PP YGHCVT 
DSGWYSVGMQLA* KTQGNKQML\CTCLGNGVSCQETAVTQTYG 
GNSNGEPCVLPFTYNGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ 
KYS FCTDHTVLVQTRGGNSNGALCHFPFXYNNHNYTDCTSEGRR 
DNMKWCXTTQNYDADQKFGFCPMAAHEE ICTTNEGVMYRIGDQW 
DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRDQCIVDDITYNVN 
DT FHKRH EEGHMLNCTC FGQGRGRWKCD PVDQCQDS ETGTFYQ I 
GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLOTYPSSSGPVEVFI 
TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP 
GHLNSYTIKGLICPG WYEGQLIS IQQ YGHQEVTRFDFTTTSTST 
PVTSNT\VTGETTPFSPLVATSESVTEITASSFWSWVSASDTV 
SGFRVEYELSEEGDEPQYIiVLPSTATSV\NlP\DLLPGRKYIVN 
VYQI SEDGEQSLIIiSTSQTTAPDAPPDPTVDQVDDTS Z WRWSR 
PQAPITGYRIVYSPSVEGS3TELNLPETANSVTLSDLQPGVQYN 
ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDLQFVEVTDV 
KVTIMWTP PES AVTGYRVDVI PVNLPGEHGQRLPLS RNTF\ ABK 
TGLS PGVT YY F KV FAVS HGRES KPLTAQQTT KL \ DAPTNLQ F VN 
B TD S TVL VRW T P PRAQ I TG YRLTVGLTRRGQ PRQ YNVG P S VS KY 
PLRNLQPAS EYTVSLVAI KGNQES PKATGVFTTLQPGSS IPPYN 
TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV 
VSGLTPGVEYVYTIQVLRDGQBRDAP\rVNK\VVTPLSPPTNLH 
LEANPDTGVLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEVV 
HADQS SCTF \DNLE VPGL E YNVS VYT VKDDKESVP I SDTI I PAV 
P P PTDLR FTN/ 1 LGPDTMRVTW \ APP PS I DLTNFLVRYS P VKNE 
GRMLQSLS 1 FFLSDN\AWLTNLIiPGTEYWS VSSVYEQHESTP 
\LRGRQKTGLDSP\TGIDFS\DITA\NSFT\VHW\ IAPRA/TP I 
TGYRIR\HHPEHF\SGRPREDR\VPHSRNS ITLTNLTPGTEYW 
S IVALNGREES P LI> IG QQSTVSD VPRDLE WAAT PTS LL I \ S WD 
APAVTVRY YR IT YGETGGNS FVQ E FTVPGSKSTAT I SGLKPGVD 
YTITVYAVTGRGDS PASS KPISINYRTE I DKPSQMQVTD VQDNS 
lSVKWIiPSSSPVTGYRVTTT\PKNGPG\PTICrJCTAGPDQTEMTI 
EGLQPTVE YVV3 VYAQNPSGESQ PLVQTAVTN"I DRPKGLAFTD V 
DVDSIKIAWESPQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 
ELQGLRPGSF^TVSWALHDDMESQPLIGTQSTAIPAPTDLiKPT 
QVTPTSLSAQOTPPWQLTGYRVRVTPKEKTGPMKEINLAPDSS 
S VVVSGLMVATKYE VSVYAIiKDTLTS RPAQG WTTLBNVS PPRR 
AR VTDATETT I TIS WRTKTETI TG FQVDAVPANGQT P I QRTIKP 
DVRSYTITGLQPGTDYKIYLYTIiNDNARSSPWIDASTAIDAPS 
NliRFLATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP 
RPGVTEATITGLEPGTEYTIYVIALKNNQKSEPLIGRKKTDELP 
QLVTLPHPNLHGPE ILDVPSTVQ KTP FVTHPG YDTGNG IQLPGT 
SGQQPSVGQQMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE 
ALSQTTIS WAPFQDTS E YI ISCHPVGTDEEPLQFRVPGTSTSAT 
LTGLTRGATYNI 1 VEAIiKDQX2RH KVREEVVTVGNSVNEGLNQPT 
DDSCFDPYTVSHYAVGDEWERMSESGFKLLCQCIiGFGSGHFRCD 
SSRWaiDNGVWYKIGEKTORQGENGOMMSCTCLGNGKGEFK^ 
REATC YDDG KTYHVGE QWQKE YLGAI CSCTC FGGQRGWRCDNCR 
RPGGEPSPBGTTGQSYNQYSQRYHQRTNTNVNCPIECFMPLDVQ 
ADREDSRE 


5364 


8066 


703 


RIiCCTGCGBGTPGASGKRGPAATTSLVIiCIPSVPPPVPFPTLWP 
PPSWRRQPPGGIRRDFSRRLRRBANLVATCIjPVRASLPHRIjNMI* 
RGPGPGLIjLJAVLCIiGTAVPSTGASKSKRQAQQMVQPQSPVAVS 
QSKPGCYDNGKHYQINQQWERTYLGNALVCTCYGGSKGFNCESK 
P EAEETCFDKYTGNTYRVGDTY BR PXDSMI WDCTCXGAGRGR I S 
CTIANRCHEGGOSYKIGDTWRRPHETGGYMLBCVCXGNGKGEIVfT 
CKP I ABKC PDHAAGTS YWGBTWEKP YQGWMMVDCTCLGEGSGR 
I TCTS RNRCNDQDTRTS YRIGDTWS K.KDNRGNLLQC I CTGNGRG 
EWKCERHTS VQTTS SGS GP FTDVRAAVYQ PQPHPQP PP YGHCVT 
DSGWYSVGMQLA* KTQGNKQML\CTCLGNGVS CQETAVTQrYG 
GWSMGEPCVLPPTYNGRTFYSCTTBGRQDGHLWCSTTSNYEQDQ 
KYS FCTDHTVLVQTRGGNSNGALCHFPFLYNNHNYTDCTSEGRR 
DNMKWOGTTQN YDADQKFG PCPMAAH3E I CTTNEGVM YR IGDQW 
DKQHD^HMMRCTCVGNGRGEWTCIAYSQLRDQCIVDDITYNVN 
DTFHKRHEEGHMLNCTCFGQGRGRWKCDPVDOCQDSETGTFYQI 
GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI 
TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP- 
GHLN S YT I KGLXPG WYEGQLI S IQQYGHQEVTR PDPTTTSTST 
PVTSNT\VTGKTTPPS PLVATSESVTEITASS FW5WVSASDTV 
SGFRVEYELSEEGDBPQYLVLPSTATSV\NIP\DLLPGRKYIVN 
VYQ IS EDGEQSLILSTSQTTAPDAPPDPTVDQVDDTSI WRWSR 
PQAPITGYRlVYSPSVEGSSTBLNLPETANSVTIjSDLQPGVQyN 
IT I YAVEENQESTP WI QQBTTGTPRSDT VPS PRDLQFVE VTDV 
KVTIMWTPPESAVTGYRVDVIPVNLPGEHGQRLPI*SRNTP\AEN 
TGLS PGVTYYFXVFAVSHGRESKPLTA(XJTTKL\DAPTNI»QFVN 
ETDS TVLVRWTP PRAQ I TG YRLTVGLTRRGQ PRQ YNVGPS VS KY 
PlJtNLQP ASE YWSLVAI KGl^ESPKATGVFTTLQPGS S I PP YN 
TBVTETTIVITWTPAPRIGFKLGVRPSQGGBAPREVTSDSGS IV 
VSGLTPGVEYVYTIQVIiRDGQERDAP\lVNK\VVTPLSPPTN£iH 
LEANPDTGVLTVSMERSTTPDITGYRITTTPTNGQQGNSIiEEVV 
HADQS S CTF\ DNLEVPGLE Y53VS VYTVKDDKESVP I SDT 1 1 PAV 
PPPTDLRFTN/ILGPDTMRVTW\APPPSIDLTNFLVRYSPVKNE 
GRMLQSLS IFFLSDN\A WLTNLLPGTEYWS VSS VYEQHES TP 
\LRGRQKTGLDSP\TGIDFS\DITA\USFT\VHW\IAPRA/TPI 
TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITI*TnLTPGTEYW 
SIVALNGREESPLLIGQQSTVSDVPRDIiEWAATPTSIiLl\SWD 
APAVT VR YYR I TYGETGGNS PVQEFTVPGSKS TATISGLKPG VD 
YTI TVYAVTGRGDS PASSKPISINYRTE IDXPSQMQVTDVQDNS 
ISVKWLPSSSPVTGYRVTTT\PKNGPG\PTKTKTAGPDQTEMTI 
EGLQPTVEYWSVYAQNP SGESQPLVQTAVTNIDRP KGIiAFTDV 
DVDS I KI AWES PQGQ VSRYRVTYSSP EDGIHELFPAPDGEE DTA 
ELQGLR PGSEYTVS WALHDDMES QPL I GTQS TAJ PAPTDL KFT 
QVT PTS LS AQWTP PNVQLTG YRVRVT PKEKTGPMKE I NLAPDS S 
S VWSG LMVAT KYEVS VYALK0TLTS RPAQGWTTLENVS P PRR 
ARVTDATETTI T I S HRTKTET I TGFQVDAVPANGQTP I QRTI K P 
DVRS YT ITGLQ PGTDYKI YLYTLNDNARS S PWIDAS TAIDAPS 
NLRFLATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP 
RPGVTE ATITGLE PGTEYT I YVIALKNNQKSBPL IG R KKTDELP 
QLVTLPHPNLHGPEILDVPSTVQKTPFVTHPGYDTGNGIQLPGT 
SGQQPSVGQQMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE 
ALSQTT I SWAP FQDTSEY 1 1 S CH PVGTDEEPIjQFRVPGTSTSAT 
LTGLTRGATYNlIVEALKDQQRHKyREEVVTVGNSVNEGLNQPT 
DDSCFDPYTVSHYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD 

SSRWCHDNGVNYKIGEKWDRQGENGC^SCTCLGNGKGEFKCDP
HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR

ADREDSRE
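
The notation defined in the table legend (one-letter residue codes, `*` for a stop codon, `/` for a possible nucleotide deletion, `\` for a possible nucleotide insertion) can be tallied mechanically. The following is a minimal sketch only, not part of the original disclosure: the function name `summarize_segment` and the returned field names are hypothetical, and spaces and line breaks inside a segment are assumed to be layout artifacts that carry no meaning.

```python
from collections import Counter

# One-letter codes listed in the table legend; "X" denotes Unknown.
RESIDUES = set("ACDEFGHIKLMNPQRSTVWYX")
STOP, DELETION, INSERTION = "*", "/", "\\"


def summarize_segment(segment: str) -> dict:
    """Tally the legend symbols in one amino acid segment (hypothetical helper)."""
    # Assumption: whitespace inside a segment is layout only, so drop it.
    compact = "".join(segment.split())
    counts = Counter(compact)
    # Anything outside the legend is reported rather than silently dropped.
    unrecognized = sorted(
        ch for ch in counts
        if ch not in RESIDUES and ch not in (STOP, DELETION, INSERTION)
    )
    return {
        "residues": sum(counts[ch] for ch in RESIDUES),
        "stop_codons": counts[STOP],
        "possible_deletions": counts[DELETION],
        "possible_insertions": counts[INSERTION],
        "unrecognized": unrecognized,
    }
```

For instance, `summarize_segment("MK\\LV/A*")` counts five residues, one stop codon, one possible deletion and one possible insertion. Characters that are not in the legend are listed under `unrecognized`, which may help flag positions that need to be checked against the original filing.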


5365 


8066


703 


RIXCTGGGEGTPGASGKRGPAATTSIjVLCIPSVPPPVPFPTIiWP 
PPSWRRQPPGGiRROFSRRLRREANLVATCLPVRASLPHRLNML 
RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS 
QSKPGCYDNGKHYOINQQWBRTYLGNALVCTCYGGSRGFNCESK 
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 
CTIANRCHEGGQS YKIGDTWRRPHETGG YMLECVCIiGNGKGE WT 
CKPIAEKCFDHAAGTSYWGETWEKPYQGWMMVDCTCLGEGSGR
I TCTSRNRCNDQDTRTS YR IGDTWS KKDNRGNLLQCI C^TGNGRG 
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DSGWYS VGMQLA+ KTQGNKQML\CTCLGNGVS CQ B T A VTQTYG 
GNSNGEPCVLP FTYNGRTFYS CTTEGRQDGHLW CSTTSN YEQDQ 
KYSFCTDHrVLVQTRGGNSNGALCHFPFLYNNHJiYTDCTSEGRR 
DNMKWCGTTQNyDADQOTGFCPMAAHEEICTTNEGVMYRIGDOW 
DKQHD^HMMRCTCVGNGRG^TCIAYSQLRDQCIVDDITYNVW 
DTFHKRHBBGHMUJ5CTCFGQGRGRWKCDPVDQCQDSETGTPYQI 
GUSWEKYVHGVRYQCYCYGRGIGEMHCQPLQTYPSSSGPVBVPI 
TKTPSQPNSHPIQMNAPQPSHISKYILRWRPKNSVGRWKEATIP 
GHLNS YT I KGLXPGWYEGQLI S I QQYGHQEVTR FDFTTTS TST 
PVTSNT \ VTGETTPFS PL VATS ES VTE ITAS 3 F WSWVS AS DTV 
SGPR VEYELSEEGDEPQYLVLPSTATS V\NIP \ DLLPGRKYI VN 
VYQ ISEDGEQSLILSTSQTTAPDAPPDPTVDQVDDTS I WRWSR 
PQAPI TGYR I VYSPSVEGSSTELNLPETANSVTLSDLQPGVQYN 
I TI YAVEBNQES TPWIQQETTGTPRSDT VPS PRDLQ FVEVTD V 
KVT I MWTP PE SAVTGYR VDVT P VNL PG EHGQRL PLSRNTF\AEN 
TGLS PGVTYYFKVFAVS HGRESKP LTAQQTTKL \ DAPTNLQ FVN 
ETDSTVLVRWTP PRAQITG YRLTVGLTRRGQPRQYNVGPS VS KY 
PLRNLQ PAS E YTVSLVAI KGMQES P KATGVFTTLQ PGS S IPPYN 
TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGS IV 
VSGLTPGVEYVYTIQVLRDGQERDAP \ I VNK\WTPLSP PTNLU 
LEANPDTGVLTVS WERSTTPDI TG YR I TTT PTNGQQGNSLEE W 
HADQS SCTF\ DNLRVPGLEYNVS VYTVKDDKES VPISDTI I PAV 
PPPTDLRFTN/ ILGPDTMRVTW\APPP S IDLTNFLVRYS PVKNE 
GRMLQS LS IPFLSDN\AWLrNLLPGT3YWSVSSVYEQHESTP 
\LRGRQKTGLDSP\TGIDFS \ DITA\NS FT\ VHW\ IAPRA/TPI 
TG YRI R\HHP EH F\ SGR PR EDR\VPHS RNSI TLTNLTPGTEYW 
SIVALNGREESPLLIGQQSTVSDVPRDIiEWAATPTSLIil \SWD 
APAVTVRYYR I T YGETGGNS P VQE FTVPGSKSTAT I SGL KPG VD 
YT I TV YAVTGRGDS PASS KPI SINYRTEIDK PSQMQVTDVQDNS 
ISVKWLPSSSPVTGYRVTTT\ PKNGPG \ PTKTKTAG PDQTEMT I 
EGLQ PTVE YVVS VYAQNP SGESQPLVQTAVTTJ IDRP KGLAFTDV 
DVDSIKIAWBSPQGQVSftYRVTYSSPEDGIHELFPAPDGEEDTA 
EUX3LRPGSEYTVSVVALHDDMESQPLIGTQSTAZPAPTDLKFT 
QVTPTS LSAQWTPPNVQLTGYRVRVTP KEKTGPMKE INLAPDS S 
S WVS G LMVAT KYEVS VYALKDTLTS R PAQG WTTL ENVS P P RR 
ARVTDATETTITISWRTKTETITGFQVDAVPANGQTP IQRTIKP 
DVRSYTITGLQPGTDYKI YLYTLNDNARSSPWI DAS TAX DAPS 
NLR FLATTPNS LLVSWQP PRARI TG YI I KYEKPGS P PRE WPRP 
RPGVTEATITGLEPGTEYriYVIALKNNQKSEPLIGRKKTDELP 
QLVTLPHPNLHGPE I LDVPS 7VQ KTPFVTH PGYDTGNG I Q L PG T 
SGQQPSVGQQMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE 
ALSQTTI3WAPFQDTSEYIISCHPVGTDEEPLQFRVPGTSTSAT 
LTGLTRGATYN 1 1 VEALKDQQRHKVRBEVVTVGNSVNEGLNQPT 
DDSCFDPYTVSK YAVGDEWERMS BSGF KLLCQCLG FGS GHFRCD 
S S RWCHDNG VNY KIGE KWDRQGEKGQMMS CTCLGNGKGEFKCDP 
HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR 
RPGGEPS PEGTTGQS YNQYSQRYHQRTNTNVNCP IECFMPLDVQ 
ADREDSRB 


5366


8066 


703 


RLCCTGGGEGTPGAS GKRGPAATTS LVLC I PSVPPPVPFPTLWP " 
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRLNML 
RG PGPGLLLLAVLCLGTAVPSTGAS KS KRQAQQMVQPQS P VAVS 
QSKPGCYDNGKHYQINQQWERTYLGNALVCTCYGGSRGFNCESK 
P EAEET CF DKYTGNTYRVGDTYE RP KDS M I WDCTC I GAG RGR I S 
CTIANRCHEGGQSYKIGDTVJRRPHETGGY^ECVCLGNGKGEWT 
CKPI AEKCFDHAAGTS YWGETWEKP YQG WtWVDCTCLG EGSGR 
I TCTSRNRCNDQDTRTS YRIGDTWSKKDNRGNLLQCI CTGNGRG 
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DSG WYS VGMQ LA* KTQG NKQML \ CTCLGNGVS CQETAVTQTYG 
GNSNGB PCVLP FT YNGRTPYS CTTEGRQDGHLWCSTTSNYEQDQ
KY S FCTDH T VL VQT RGGN S NG ALCH I? P F L YNNHNY TD CTS EG RR 
DNMKWCGTTQNYDADQKFGFCPMAAHEE I CTTNEGVMYRIGDQW 
DKQHDMGHMMRCTCVGNGRGBWTCIAYSQXiRDQCIVDDITYNVN 
DTFHKRHBEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQ I 
GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI 
TBTPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP 
GHLNS YTI XGLKPG VVYEGQL I S I QQYGHQEVTRPDFTTT STS T 
PVTSNT\VTGETTPPS PLVATSES VTEI TASSFWSWVSASDTV 
SGFRVEYELSEEGDEPQYLVLPSTATSV\NIP\DLLPGRKYTVN 
VYQ I SEDGEQSLI LSTSQTTAPDAPPDPTVDQVDDTS I WRWSR 
PQAPrTGYRIVYSPSVEGSSTELNLPETANSVTLSDLOPGVQYN 
ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDIiQFVEVTDV 
KVTI MWTPPES AVTG YRVDVI PVNLPGEHGQRIiPL»SRNTF\aEN 
XGLS PGVT YYFKVFAVSHGRES KPLTAQQTTKL \ DAPTNLQFVN 
ETDSTVLVRWTPPRAQITGYRLTVGLTRRGQPRQYNVGPSVSKY 
PLRNLQ PAS E YTVSLVAI XGNQES PKATGVFTTLQPGSSIPPYN 
TB VTBTTI VI TWTPAPRI G FKLG VRP S QGG E APREVTSDSGS IV 
VSGI/TPGVE YVYTIQVLRDGQERDAP \ I VNK\ WTPIiSPPTNLH 
LEANPDTG VTiTVS HERS TT PD ITG YR ITTT PTNGQQGNSLEB W 
I IADQS S CTF \ DNLEVPGLE YNVS VYTVKDDXESVPISDTI I PAV 
PP PTDLRFTN/ II*GPDTMRVTW \ AP PPS I DLTNFLVR YS P VKNE 
GRMLQSLS IFFLSDN\AWLTNLLPGTEYWSVSSVYEQHESTP 
\LRGRQKTGLDSP \TGIDFS \DITA\NSFT\VHW \IAPRA/TPI 
TGYRIR\HHPBHF\SGRPREDR\VPHSRNS ITLTNLTPGTEYW 
SIVALNGREBSPLLIGQQSTVSDVPRDLEVVAATPTSLLI\SWD 
APAVTVR YYR I TYGETGGNSP VQEFTVPGS KSTATISGLKPG VD 
YTITVYAVTGRGDS PASS KP I S IN YRTE IDKPS QMQVTDVQDNS 
ISVKWLPSSSPVTGYRVTTT\PKNGPG\PTKTKTAGPDOTEMTI 
EGLQPTVE YWSVYAQNPSGESQPLVQTAVTN I DRPKGLAFTD V 
DVDS I KIAWES PQGQVSRYRVTYS S P EDGIHBLF PAPDGE EDTA 
EliO^LRPGSBYTVSVVALHDDMESQPLIGTQSTAI PAPTDLKFT 
QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINLAPDSS 
S WVSGLMVATKYEVSVYALKDTLTS RPAQGVVTTLENVS PPRR 
ARVTDATETT I TI\S WRTKTE TI TG FQ VD AVP ANG QTP I QRT I KP 
DVRSYT I TGLQPGTDYK I YLYTLNDNARSS P WI DASTAI DAPS 
NLRFLATTPNSLLVSWQPPRARITGYI I KYEKPGSPPREWPRP 
RPGVTEATITGIiEPGTBXTIYVIALKNNQKSEPLIGRKKTDEIiP 
QI1VTLPHPNI1HGPE ILDVPSTVQKTP FVTHPG YDTGNG I QLPGT 
SGQQPSVGQQMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE 
ALSQTTISWAPFQDTSEYI IS CHPVGTDEEPLQFR VPGTS TSAT 
LTGLTRGATYNI IVEALKDQQRHKVREE VVTVGNS VNEGLNQP T 
DDSCFDPYTVSHYAVGDEW E RMSESGFKLLCQCLGFGSGHFRCD 
SSRWCHDKGVNYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP 
HEATCYDDG KTYHVGEQWQKE YLGAI CS CTCFGGQRGWRCDNCR 
RPGGE PS PEGTTGQS YNQYSQRYHQRTNTNVNCP I E CFMPIiDVQ 
ADR EDS RE 


5367 


235


3591 


KKI LNML C KKN IV I E YLAD 1 L YE YLYG FCFSGI KKYL 1 1 HVLRL " 
ILELWMTRIxLLEKSVSLQTQYLLLIVKILSWPPGKEMRHHLQIM 
E VMMRKQDS / RIVONGSEQQLQKEIiADVLMDPPMDDQ PGEKELV 
KRS QLDGEGDGPLSNQLS AS ST I NP VP LVGLOKPEMSLP VKPG 0 
GDS E AS S P FTP VADEDS WFS KLTYLG CAS VNAPRS E VEALRMM 
SILRSQCQISLDVJTLSVPNVSEGIVRLLDPQTNTEIANYPIYKI 
LFCVRGHDGTPESDCFAFTES H YNAEL FRI HVFRCE I QEAVSR I 
LYSFATAFRRSAKQTPLSATAAPQTPDSDIFTFSVSLEIKEDDG 
KG YFSAVPKDKDRQCFKLRG^ I DKKIV I YVQQTTNKE LA I ERCF 
GLLI^PGKDVRNSDMHIjLDLESMGKSSDGKSYVTTGSWNPKSPH 
FQVVNEETP KDKVIiFMTTAVDLVITEVQEPVRFIiIiETKVRVCS P 
NE RLFWP FS KRSTTENFFLKLKQI KQRERKNNTDTlj YE VVCLBS 
ESERERRKTTASPSVRLPQSGSQSSVIPSPPEDDEBEDNDEPLL 
SGSGDVSKECAEKILETWGELLSKWHLNLNVRPKQLSSLVRNGV 
PEALRGEVWQIJJ^CHNNDHLVEKYRILITKESPQDSAITRDIN
RTPPAHDYFKDTGGDGQDSLYKICKAYSVYDEEIGYCQGQSFIiA 
AVLLLHMPEEQAFSVLVKIMFDYGIjRELFKQNFEDLHCKFYQLE 
RLMQEYIPDLYNHFLDISLEAHMYASQWFLTLFTAKFPLYMVFH 
IIDIiLLCEGISVTFNVALGLIiKTSKDDLIjIiTDFEGALKFFRVQL 
PKHYRSEENAKKLMELACNMKISQKKLKKYEKEYHTMREQQAQQ 
ED P I ERFE11ENSRLQEANMRLBQENDDLAHELVTS KIALRKDLD 
^ESKADALNK3LLMTKQKLIDAEEEKRRLEEESAHLKKMCRKE 
LDKAESEIKKNSSIIGDYKQICSQIiSERLEKQQTANKVEIEKIR 
QKVDDCBRCREFFNKEGRVKG 1 SSTKEVLDEDTDEBKETLKNQL 
REMELELAOTKL\QLVEAECK2QD\ljEHPF*GLPFNE\VQAA\K 
KTWFNRTLSS 1 KTATGVQG KETC 


5368


573 


2014 


GAAAG AADP RRG S LGGRTMLDFA t FAWFiiLAiA/GAVLVLYPAS 
RQAAGIPG ITPTEEKDGNLPD1 VNSGSLHEFLVNLHERYG PWS 
FWFGRRLWSI^TVDVLKQHINPNKTI^D/LF^NHAEVIIKVSrw 
WWQCE*KP\ORKKLYENGVTDSLXSNFAIiIiLKLPBELLDKWI*SY 
PBTQH\VPLSQHMLGFAMKSVTQMVMGSTFEDDQEVIRFQKNHG 
TVWS E IGKX3 FLDGSLDKKMTR K KQ Y ED ALMQLE S VLRN 1 1 KERK 
GRNFSQHIFIDSLVQGNLNDQQILED3M1 FSLASCI ITAKLCTW 
A I WFLTTSEEVQKKL YEE INQ VFGNG PVTPE KI EQLR YCQHVLC 
ETVRTAKLTPVSAQLQDIEGKIDRFI IPRETLVLYALGWLQDP 
NTWPSPHKFDPDRFDDELVMKTFSSLGFSGTQECPELRFAYMVT 
TVLLSVIiVKRLHLLSVEGQVIETKYELVTSSREEAWITVSKRY 


5369 

•T 


1 ~l 


6622 


PRSLCFSLWAEAAVLADGGLRRRRRLIiRGTMS AS FVPNGASLED 
CHCNLFCLADLTG I KWKKYVWQG PTSAPI LFPVTEEDP ILS SFS 
RCIiKADVLG/VWRRDQRPERRE\L*IFWGGEDP\VLLTLFTMTY 
QKKKMECGRMDF PMNAVLCFSKAVHNLLERCLMNRNFVR I G KWF 
VKPYEKDEKP INKSEHLSCS FTFFLHGDSNVCTSVE 1NQHQPVY 
LLSEEHlTIiAQQSNSPFQVirjCPFGLNGTLTGQAFKMSDSATKK 
LIGEWKQFYP ISCCLKEMSEEKQEDMDWEDDSLAAVEVLVAGVR 
MI YPACFVLVPQSDIPTPS PVGSTHCS SSCLGVHQVP AS TRDPA 
MS SVTLTPPTS PEEVQTVD PQSVQKWVKFSS VSDGFNSDS TS HH 
GGKI PRKLANHVVDRVWQECNMNRAQNKRKYSASS GGLCEEATA 
AKVASWDFVEATQRTNC3 CLRHKNLKS RNAG QQGQAPS LGQQQQ - 

ILPKHKTNEKQEKSEKPQKRPLTPFHHRVSVSDDVGMD\ADS \A
SQRLV\ ISAP\DSQ\ VRFSNIR\TNDVAK\TPQMHGTEMANSPQ
PPPLSP\HPCDWDEGVTKTPSTPQSQHFYQMPTPDPLVPSKPM
EDRIDSLSQS FP PQ YQEAVE PT VYVGTAVNLE ED EAN IAW KYYK
FPKKKDVEFLPPQLPSDKFKDDPVGPFGQESVTSVTELMVQCKK

PLKVSDEkVQQ YQIKNQ CLSAIAS DAEQEPKIDP YAFVEGDEBF 
LFFDKKDRQNS EREAGKKKKVEDGTSS VTVLSHEED AMSLFS PS 
IKQDAPRPTSKARPPSTSLIYDSDLAVSYTDLDNLFNSDEDELT 
PGS KRS ANGSDDKAS CKB S KTGNLDPLSCISTADLHKMYPTP PS 
LEQKIMGFS PMNMNNKBYGSMDTTPGGTVLEGNSSSIGAQFKIE 
VDEG FC S PKPS EI KDFS YVYKPENCQ I LVGCSM FAPLKTLPSQ Y 
LPIilKLPEECIYRQSWTVGKLELLSSGPSMPPIiCBGDGSNMDQE 
YGTAYTPQTHTSCGMPPSSAPPSNSGAGIIiPSPSTPRFPTPRTp 
RTPRTPRGAGGPASAQGSVKYENSDLYS PASTPSTCRPLNSVEP 
ATVPS I PEAHSLYVNLILSESVMNLFKDCNSDSCCI CVCNMNIK 
GADVGVYI PDPTQEAQYRCTCG FSAVMNRKFGNNSGLFFEDELD 
I IGRNTDC^KEABKRFEALRATS AEHVNGGLKES BKLSDDLILL 
LO^CTNLPSrFGAADQDPFPKSGVISNWVRVEERDCCNDCYLA 
LEHGRQFMDNMSGGKVDEALVKSSCLH PWSKRNDVSMQCSQDIL 
RMLLS LQPVLQDAI QKKRT VR PWGVQG PLTWQQFHKMAGRGS YG 
TDES P E PLP IPTFLLG YDYDYLVLSP F ALPYWERLMLEP YGS QR 
D IAYWLCPENEALLNGAKS KFRDLTAI YBSCRLGQHRPVSRLL 
TDGIMRVGSTASKKLSKKLVAEWFSOAADGNNEAFSXLKLYAQV 
CRYDLGPYLASLPLDSSLLSQPNLVAPTSQSLITPPQMTNTGNA 
NT PS ATLASAAS S TMTVTS GVAI S TS VATANS TLTT AS TSS S S S 
SNLNSGVSSNKLPSFPPFGSMNSNAAGSMSTQANTVQSGQLGGQ 
QTSAIiO/rAGlSGBSSSLPTQPHPDVSBSTMDRDKVGI PTDGDSH 
AVTYPPAI WYIIDPFTYENTDES TNSSS WTLGI>LR CFLEMVQ
TliPPHIKSTVSVQIIPCQYtiliQPVKHEDREIYPQHLKSIiAPSAP 
TQCRRPLPTSTNVKTLTGFGPGIiAWETALRSPDRPECIRLYAPP 
FlLAPVKDKQTELGETFGEAGQKYNVIiFVGYCI^HDQRWrLASC 
TDLYGEIjLBTCIINIDVPNSARRKKSSARXFGLQKLWEWCLGLV 
QMSSLPWRVVIGRLGRIGHGELKDWSCLIjSRRNLQSLSKRLKDM 
CRMCGISAADSPSILSACLVAMEPQGSPV1MPDSVSTGSVFGRS 
TTLNMQTSQLNTPQDTSCTHILVFPTSASVQVASATYTTENLiDL 
APNPNNDGADGMGIFDLLDTGDDLDPDIINILPASPTGSPVHSP 
GSHYPHGGDAGXGQSTDRLLSTBPHEEVPKILQQPIiALGYFVST 
AKAGPI^DWFWSACPQAQYQCPIjFIjKASLHLHVPSVQSDELIjHS 
KHSHPliDSNQTSDVLRFVLEQYNALSWLTCDPATQDRRSCLPIH 
PWLNQLYN FI MNML 


5370 


122S 


716 


RWSRKIiELRRAAQATBSRPPQSQEMHPPTGKEVHALKRLRDSAN 
A>nDVETVQQLLElX3ADPCAADDKGRrALHFASCNGNDQ X VQLLL 
DH GAD PNQRDG LGNT PLHLAA CTNHVPV IT TLLR3GAR VDALDR 
AGRTPLHLAKSKLNILQEGKAQCLKAVR /HGGEADHP YAEGVSG 
APRAT*AARCSGVFPSPSRWLGSAPWSRSSCTIWSLPLHEAKCR 
AVRPLSSAAQGSAPSSSSCCTVSTSLALAESLSLFRACTSLPVG 
GCISWL 


5371


1331 


167 


IAAMLWKLLLRSQSCRLCSfTuS^^ 

SKENTRTVE KLYKCSVD I RKIRR \ * KDGYF * RMKPMLKKLRI / P 
LQELGADETAVAS XLERCP EAIVCSPTAVNTQRKLWQLVCKNEE 
ELIKLIEQFPESFTTIKDQSNQKLNVQFFQELGLKNVVISRIiLT 
AAPNVFHNPVEKNKQMVRIIOESYLDVGGSEAIMKVWLLKLIjSQ 
NPF I L LNS PTAI KETLEFLQEQGFTS FE I LQLLS KLKGFLFQLC 
PRS IQNSISFS KNAFKCTDHDLKQLVLKCPALLYYSVPVLEERM 
QGLLREXSISIAQIRBTPIflVLELTPQIVQYRIRKLNSSGYRIKDG 
HLANLNGSKKE FEANFGKI QAKKVRPLFNPVAPLNVEB 


5372 


51 


857 


SPGAQFLWAAPDMPDPLFSAVQGKDEILHKALCFCPWLGKGGME 
PLRLLILLFVTELSGAHNTTVFQGVAGQSLQVSCPYDSMKHWGR 
RKAWCRQLGEKG PCQRWS THNLWLLSFLRRWNGSTAI TDDTLG 
GTLTITLRNLQPHDAGLYQCQSLHGSEADTLRKVIjVEVIiADPLD 
HRDAGDLWFPG\DLRASRM?MWSTASPC3ASWKEKSPSHPLPSFS 
SWPASFSSRF*QPAPSGLQPGMDRSO^HIHPVNWTVAMTQGISS 
KLCQG 


5373


2814 


346 


VKKTKS I FNS AMQEMBVY VENIRRKFGVFKYS P FRTP YTPNSQY 
QMLLDPTNPS AGTAKI DKQE KVKLNFDMTAS P KI LMS KP VLS GG 
TGRR I SLSDMPRSPMSTNSSVHTGSDVEQDAEKKATSSHFSASE 
ES MD FLDKSTAS PASTKTGQAGSLSGS PKPFS pq lsap ittktd 
KTSTTGS ILNLNLDKSKAEMDLKBLS ESVQQQSTP VPLIS PKRQ 
IRSRPQLNLDO IESCKAQLGINE I S EDVYTAVEHSDSEDS EKS 
DSSDSEYISDDEQKS+GTSQEDTEDKEGCQMDKBPSAVKKKPKP 
TNPVE I KEELKSTS PASE KADPGAVKDKAS PE PEKDFSGKAKPS 
PHPIKDKLKGKDETDSPTVHLGLDSDSE\NELVIDLGEDHSGRE 
GRKNKKEPKEPSPKQDWGKTPPSTTVGSHSPPETPVLTRSSAQ 
TSAAGATATTSTSSTVTVTAPAPAATGS PVKKQRPLLPKE \ TAP 
AVQRS CGTSSTVQQKEI TQS PSTSTITLVTSTQS SPLVTSSGSM 
STLVS S VNGDLP I GTASADVAADIAKYTS KL\ MDAI KGTM\TEI 
YNDLSKN\TTWKAQLAEDSQGLRIEIEKLQWLHQQEL\SEMKHN 
LELTMAEMRQSWEQBRDRLIAEVKKQLEXEKOXJAVDBTKKKQWC 
ANFKKKAI FY CCWNTS YCDYPCQ\ QAHWPBH \MXS CTQSATAPQ 
\QBADAE\ VNTETLNKS S QGSS SSTQS APSETAS A\S KB KETS A 
EKS KESGSTLDLSGSRBTPSS ILLGSNQGSDHSR\SNKSSWS SS 
DEKRGS\TRSDHN/TPSTQHGRSLLPGKESRAGTPFliGTSK


5374


2814 


346 


VKKTKS IFNS AMQEMEVYVENIRRKFGVFN YS P FRTPYTPNSQY 
QMLLDPTNPS AGTAKIDKQEKVKLN FDMTASPKILMSKP VLS GG 
TGRRISLSDMPRSPMSTNSSVHTGSDVEQDAEKKATSSHFSASE 
ESMDFLDKSTASPASTKTGQAGSLSGSPKPF3PQLSAPITTKTD 
KTSTTGS I LNLNLDRSKAEMDIiKELSESVQQQSTPVPLIS PKRQ 
IRSRFQLNLDKTIESCKAQLGINE ISEDVYTAVEHSDSEDSEKS 
DSSDSEYISDDEQKS*GTSQEDTEDKEGCQMDKEPSAVKKKPKP
TNPVEI KEEUCSTSPAS BKADPGAVKDKAS PEPEKDFSG KAKPS 
PHP I KDKLKGKDETDS PTVHLGLDSDSE \NELVIDLGEDHSGRE 
GRKNKKBPKEPS PKQDWGKTPPSTTVGSHS PPBTPVLTRSSAQ 
TSAAGATATTST3STVTVTAPAPAATGS PVKKQRPLLPKE\TAP 

AVQRSCGTS STVQQKE ITQSPSTSTITLVTSTQ6SPLVTS SGSM
STLVSS VNGDLP IGTASADVAADI AKYTSKL\MDAIKGTM \TE I
YNDL S KN\TT WKAQLAEDSQGLRI E I EKLQ WLHQQEL \ S EMKHN
LELTMAEMRQSWEQERDRLILAEVKKQLELEKQQAVDETKKKQWC
ANFKKEAIFYCCWNTSYCDYPCQ\QAHWPEH\MKSCTQSATAPQ
\QEADAE\ VNTETLNKS S QG SS S S TQS APS ETAS A\ S KE KETSA
EKSKESGSTLDLSGSRETPSSILLGSNQGSDHSR\SNKSSWSSS
DBKRGS\TRSDHN/TPSTQHGRSLLPGKESRAGTPFLGTSK


5375 


2907 


1116 


HIFLAEEEPMLERRCRGPLAMGPAQPRLIiSGPSQESP^LGKES 
RGL RQQGT S VA \Q SGAQ AP GRAHR CAHCRRHF P GWVA\ L WLHTR 
RCQA/ RGL PL P CP ECGRR FRFIAP FLALHRQVHAAATPD WG FACH 
LCGQSFRGWVALVLHLRAHSAAKAGPFACPKMARDAFWRRKAAS 
SSIIiRRCHPSRPRGPRPFICGNOGRSlLPTWDQ/UCVAHKRVHV 
SRR P* ERGPP AKVFWG P RPRGP PTGDTPPG PGGDAVDRP F \QCA • 
CCGKRFRHK\ PNLIRSHAACTSGERPHQ/CSRECG\KRFTNKPY 
LTS\HRRITHTARQPYPCKECGRRFRHKPNLLSHSKIHKRSEGS 
AQAAPGPGSPQLPAGPQESAAE PTPAVPLKPAQEPPPGAP PEHP 
QDP IEAP PS Ii YSCDDCGRS FRLER FLRAHQRQHTGERP FTCAEC 
GKNFGKKTHLVAHSRVHSGERPPRLARKCGRRFLPRASQSGGRN 
SABPNAPRFGP FVCPDCG KAFRHKP YIiAAHRP I ATPAEKP YVCP 
DCRKAFSQKSNL\VSHRRIHTGERPYACPDCDRSFSQKSNLITH 
RKS HI RDGAFCCAICGQTFDDEERLIiAHQKKHD V 


5376


4504 


591 



VST FS LCLWP AGGGGRGR VSNMAQS KRHVYSRTPSGSRMSAEAS 
ARPLRVGSRVE VI GKGHRGT VAY VGATLFATGKWVGVI LDEAKG 
KNDGTVQGRKYFTCDEGHGI FVRQS Q I Q V FEDG ADTTS PBTPDS 
SASKVLKREGTDTTAKTSKLRGLKPKJCAPTARKTTTRRPKPTRP 
ASTGVAGASSSLGPSGSASAGELSSSEPSTPAQTPLAAPIIPTP 
VLTSPGAVPPIaPSPSKEEEGLRAQVRDljEEKLETLRLKRAEDKA 
KLKELEKHKIQLEQVQEWKSKMQEQOADLQRRIjKEARKEAKEAI^ 
EAKERYMEEMADTADAIEMATLDKEMAEERAESLQQEVEALKER 
VDELTTDLEILKAEIBEKGSDGAASSYQLKQLEEQNARLKDALV 
RMRDLS S S EKQEHVK\ LQKLMEKKNQELEWRQQRERLQEELS Q 
AESTI DELKEQVDAAiGAEEMVEMLTDRNLNLEEKVRELRETVG 
DLSAMNEMNDELQENARETELELREQLDMAGARVRBAQKRVEAA 
QETVADYQQTIKKYRQLTAHLQDVNRELTNQQEASVERQQQPPP 
ETFD FKI KFAETKAHAKAJEMELRQME VAQANRHMSLLTAFMPD 
SFLRPGGDHDCVLVLLLMPRLICKAELIRKQAQEXFELSENCSE 
RPGLRGAAGEQLSFAAIGLVY\SLMPAAGHRYHRY*CHALSQCK 
LD\VYKKVGSLYPEMSAHERSLDFliIELLHKDQLDETVNVEPI*T 
KAI KYYQHLYS I HLAEQPEDCTMQLADH I KFTQS ALDCWS VEVG 
RLRAFLQGGQEATD IALLLRD LETS CS \DIRQFCKKIRRRMPGT 
DAPGIPAALAFGPQVSDTLLDCRKHLTWWAVLQEVAAAAAQLI 
APLAENEGLLVAAIiEELAFKASEQIYGTPSSSPYECLRQSCNIL 
ISTMN K\ LVTAMQEGBYDAER P PSKP PP \ VELRAAALRAE I TDA 
EGIXSLKiEDRETVIKELKKSLKIKGEEIjSEANVRliTLLEKKLDS 
AAKDADERI E KVQTRLEETQALLRKKE KEFEETMDALQAD IDQL 
EAEKAEL KQRLNSQS KRTI EG LRGP PPSG IATL VSG IAGEEQQ R 
GMPGQAPGSVPGPGLVKDSPLLLQQISAMRLHISQLQHENSIL 

LNQLSTHTHNA^DI TRTSPAAKS PSAQIjMEQVAQLKS LSDTVE KL 
KDE^KETVSQRPGATVPTDFATFPSSAFTjRAKEEQQDDTVYMG 
KVTFS CAAG FGQRHRL VLTQEQLHQLHS RLI S 


5377


762 


1106 


DVPC KRVLPAEAQE KGQLTLS CGESGEEG \ F * YHEVRQAEG ES * 
/WFGPNVRLVHTQLKTKKPSGTLKAKFYLPfTGSTKFAARISCTK 
SS*WPGYDGWWGGQYI FIFRGMRWEEQP 


5378 


2009 


664 


QASGTTLRPLPDLPQLKRP^TSRNRALKPRGRLVLMTSCLPAL 
RFIATPRIiSAMPHIDNTDVKLDFKDVLLRPKRSTIiKSRSBVDLTR
SFSFRNSXQTYSGVPI IAANMDT VGTFEMAKVLCKS * VPGSFWD 
VPQMGCVFLIYKLFTLKWKMLLLSVLLPAS ILVAEKFSLFTAVH 
KH YSLVQWQE FAGQN PDCLBHLAASSGTGS S DFEQLEQ I LEA I P 
Q VKYTCLDVANGYS EHF VEFVKD V3UCR FPQHTI MAGNWTGEMV 
BBLILSGADI IKVG IG PGSVCTTRKKTGVGYPQLSAVMECADAA 
HGLKGHI I SDGGCS CPGDVAKAFGAGADFVMLGGMLAGHSESGG 
ELIERDGKKYKLFYGMSS*I\AM\KKYAGGVAEYRASEGKTVEV 
PFKGDVEHTIRDILGGIRSTCTYVGAAKLKHLSRRTTFIRVTQQ 
VNPIFSEAC 


5379 


2009 


664 


QASGTTLRP LPDLPQLKRRRATS RNRALKPRGRLVLMTS CLPAL 
R FI ATPRLSAMPHI DNDVTCt* DFKD VLLR PKRSTLKS R5 EVDLTR 

VPQMGCVPLIYKLFTLKWKMLLLSVLLPAS ILVAEKFSLFTAVH 
KHYSLVQWQEFAGQNPDCLBHLAASSGTGSSDFEQLEQILEAI P 
Q VK YI CLOVANGYS EHFVKFVKDVRKRFPQHTIMAGNVVTGEMV 
EEL ILSGADI I FCVGIGPGSVCTTRKJCTGVGYPQLSAVMECADAA 
HGLKGHIISDGGCS CPGDVAKAFGAGADFVMLGGMLAGHSESGG 
EL IERDG KKYKLFYGMS S * I\AM\KKYAGGVAE YRASEGKTVEV 
PFKGDVEHTIRDILGGIRSTCTYVGAAKLKBLSRRTTFIRVTQQ 
VNPIFSEAC 


5380 


2 


2050 


PS RAGGAERGRAAAARS PGGS AAGWECPSVLDEAGACTMSSCVS 
SQPS SNRAAPQDELGGRGSSSSESQKPCEAIiRGLSSLS IHLGME 
SFlVVTECEPGC»VDLfiIJUU)RPLEAIX^EVPLDTSGSQARPHL 
SGRKLSLQERSQGGLAAGGSLDMNGRCICPSLPYSPVSSPQSSP 
RLPRRPTVESHHVS ITGMQDCVQLNQYTLKDEIGKGSYGWKLA 
YNENDNTYYAMKVLSKIGCLIRQAAFPRRPPPRGTRPAPGGCIQP 
RGPI\EQVYQEIA\ILKKLDHPNVV\KLVEVL\DDPNEDHLYMV 

H\RD I KPSNLLVGEDGHIKIADFG VSNEFKGSDALLSNTVGrPA 
rKAPESLSETRKIFSGKALDVWAMGVTLYCFVFG*CPFMDERIM 
CLHS KI KSQALE FPDQPD I AEDLKDLITRMLDKNPESR I WPE I 
KI>HPWVTRHGASPLPSEDSNCTLVEVTBEEVENSVKHIPSLATV 
ILVKTMIRKRSFGNPFEGSRRBERSLSAPGNLLTKKPTRECESL 
SELKT*KIS PLPACCKVT* §FPHPSGCRPSCWQPPFLHTHSQPR 
*PEPPRTDEALCPYETGRTCWAPLLQVLWWVGTPIjPFPLST3WL 
PDLVGAPGSHFCFLNIALLRYNSHTM 


5381 


2 


2050 


PSRAGGAE RGRAAAAR S PGGSAAGWECPS VLDEAGACTMSS CVS 
SQ PS SNRAAPQDE LGGRG S SSS E S Q KP CEALRGLSSLS 1 HLGM E 
SFIVVTECEPGCAVDLGIiARDRPLEADGQEVPLDTSGSQARPHL 
6GRKLSLQERSQGGLAAGGSLDMNGRCICPSLPYSPVSSPQSSP 
RLPRRPTVESHHVS ITGMQDCVQLNQYTLKDEIGKGSYGWKIiA 
YNENDNTYYAMKVLSKKKLIRQAAFPRRPPPRGTRPAPGGCIQP 
RGPl\EQVYQEtA\lLKKLDHPNW\KLVEVL\DDPNEDHLYMV 
F\ ELVNOGP VMEVPTLICPLiSEDO P VTJfinTi T VG T Br VT.H vn ffT T 
H\ RD I KPSN LLVGEDGH I KI AD FGVSNE FKGS DALLSNTVGTP A 
FMAPESLSETRKIFSGKALDVWAMGVTLYCFVFG*CPFMDERIM 
CLHSKIKSQALEFPDQPDrAEDLKDLITRMLDKNPESRIWPEI 
KLHPWVTRHGAE PLPSEDENCTLVEVTEEE VENSVKH I PSLATV 
ILVKTMIRKRS FGNPFEGSRREERSLSAPGNLLTKKPTRECBSL 
SEIiKT*KlSPLPACCKVT*EFPHPSGCRPSCWQPPFLHTHSQPR 
* PEPPRTDEALCPYETGRTCWAPLLQVLWWVGTPLPFPLSTS WL 
PDLVGAPGSHFCFXiNIALIiRYNSHTM . 


5382 


1536 


203 


GARGS QQDAPALQE ABVRGPERAQ PARGRMTKARL FRLW LVLGS " 
VFMILLirVYWDSAGAAHFYLHTSFSRPHTGPPLPTPGPDRDRE 
LTJUOSDVDEFLDKFLSAGVKQSDLPRKET^QPPAPGSMEESVRJG 
YDWSPRDARRSPDQGRQQAERRSVLRGFCANSSLAFPTKERPFD 
DI PNS E LSHLI VDDRHGAI YCYVPKVACTNWKRVM I VLSGS LLH 
RGAPYRDPLRIPREHVHNASAHLTFNKFHRRYGKLSRHLMKVKL 
KKYTKFLFVRDPFVRLISAFRSKFELENEEF/^PQVRRAHAAAV 
RQPHQ P ARLGARGLPR WPQ \ VS FANF I Q YLLD PHT3 KLAP FNEH 
WRQ VYR LCHP CQ I D YD FVG KLETLDEDAAQLLQLLQ VD LAAPL P
PELPGTGP PSS WBEDWFAKI PliAWRQQLYXLYEADF VLFG YPKP 
KNLLRD 


5383 


45 


5250 


VERLLGC RNS KRTWRML 1 S KNM P WRR&QG I S FGMYSAEELKKLS 
VKSITNPRYLD3LGNPSANGLYDLALGPAD3KEVCSTCVQDFSN 
CSGHLGHIELPLWYNPLLFDKLYLLLRGSCliWCHMLTCPRAVI 
HLLLCQLRVLEVGALQAVYELERIIiSRFLEBNADPSASEIREEL 
EQYTTEIVQNNLIX3SQGAHVKNVCESKSKLIALFWKAHMNA 
PHCKTGRSVVRKEHNSKLTITPPAMVHRTAGQKDSEPLGIEEAQ 
IGKRGYLTPTSAREHLSAIiWKNEGFFliNYLFSGMDDDGMESRFN 
PS VFFLDFL WP PSRS RP VSRLGDQMFTNGQTVNLQAVM KDWL 
IRKLLALMAQEQKLPEEVATPTTDEEKDSLIAIDRSPLSTLPGQ 
SLIDKLYNIWIRLQSHVNIVFD3EMDKLMMDKYPGIRQILEKKE 
GLFRKHMMGKRVDYAARSVICPDMYINTNBIGI PMVFATKLTYP 
QPVTPWNVQELRQAVINGPNVHPGASMVINEDGSRTALSAVDMT 
QRE AVAKQLLTPATGAP KPQGTK I VCRHVKNGD I LLLNRQ PT LH 
RPS IQAHRAR ILPEEKVLRLHYANCKAYNADPDGDEMNAHFPQS 
EI^RAEAYVIJlCrD0X3YLVPKDGQPIiAGLlQDHMVSGASMTTRG 
CFFTREHYMELVYRGLTD KVGR VKLLS PS I LXPFP LWTGKQ WS 
TLLINII PEDHI PLNLSGKAKITGKAWVKBTPRS VPGFNPDSMC 
ESQVI IREGELLCGVLDKAHYGS S AYGLVHCC YE I YGGETSGKV 
LTCLARLFTAYLQLYRGFTL3VEDI LVKPKADVKRQRI I EESTH 
CGPQAVRAALNLPEAASYD3VRGKHQDAHLGKDQRDFNMI DLKF 
KE EVNH YSNE I NKACMP FGLHRQFPENTLQLMVQSGAKGSTVNT 
MOISCIiIjGQIELEGRSTPZjMASGKSJLPCFEPYEFTPRAGGFVTG 
RFLTG I KPPE FFFHCMAGREGIjVDTAVKTSRSG YLQRC I 1 KHLE 

glwqydltvrdsdgswqflygedgldipktqflqpkqfpfla 

SNYE VI MKSQHLHE VIjSRAD P KKALHHFRAI KKWQ S KHPNTLIiR 

rgafi^ysqkiqeavkalklesenrngr/rpwds/g/rmlrmwy 

EIjDEESRRKYQKKAAACPDPSLSVWRPDIYFASVSETFETKVDD 

ysqbwaaqteksyekselsldrlrtllql\kwqrslcepgeavg 
llaaqs i geps tqmtlntfhfagrgemnvtlg i prlre i lmvas 
aniktpmmsvpvlntkkaljcrvkslkkqltrva^evlqkidvq 
esfcmeekqnkfqvyqlrfoflphayyqqekclrpedilrfmet 

c Aiuwiao xiuwAMCw^wMUrJuM VM 1 KRATQRuIjDNAGELGRSRG 

eqbgdeeeeghlvdabaeeadadasdakrkekqeeevdyeseee 
eeregeenddedmqeernphregarktqeqdeevgl/gh*ggpv 
psrppdaapethpqpgapga\eamerrvqavreihpfiduyqyd 
teeslw cq vtvklplmkinfdmsslws lahgavi yatkg i tr c 
liajett1wknekelvlnteginlpelfkyaevldlrrlysndih 

iWAiwwuav ruuv xf Av X^XAvlJJrKHiwi*VADYMCF 
EGVYKPI^FGIRSNSSPLQQMTFETSFQFLKQATMLGSHDBLR 
S PSACLWGKWRGGTGLPELKQPLR 


5384


196 


886 


QSCGQRLPT VL* L*GPPGS CPC 1LSLF \ PGRPHALPEt RPY IN I 
T I LKGDKGD PGPMGLPGYMGREC! port? nrz'Drfz Qvrnvr two dp 

APCQKRFFAFSVGRKTALKSGEDFQTLLFERVFVNLDGCFDMAT 
GQFAAPLRGIYFFSIiNVHS WNYKETYVH IMHNQKEAVIL YAQPS 

EI^ IMQSQS VMLDLAYGDRVWVRLFKRQRENAI YSNDFDTYI TF 
SGHLIKAEDD 


5385


326 


799 


IjMVPRTKKEAPAPPKAEAKAKAL \kakkavlkdveshkknkihm " 
S PTFRRPKTL+LRRQPKYF WKSTPRRNKLDHHVII KFPLTTE * A 
VXXI ENNStiLVFT VDVKANKHQIKQAVKK/ LCDID VA K VNTL J Q 
SDGERKAYVRIjAPDYBALVVATKIGIT 


5386 


326


799 


IiMVPRTKKEAPAPPKAEAKAKAL\KAKKAVLKDVHSHKKNKIHNl 
SPTFRRPKTL*LRRQPKYPWKSTPRRNKLDHHVIIKFPLTTE*A 
VKKI ENNSLLVFTVDVKANKHQ I KQAVKK/IjCDID VAKVNTL IQ 

SDGERKAYVRIAPDYDALWATKIGIT


5387


2 


2117 


FVVAASGGCWFVLGERRAGSLLSASYGTFAMPGMVLFGRRWAIA
SDDLVFPGFFELWRVLWWIGILTLYLMHRGKLDCAGGALLSSY
LIVLMILLAWICTVSAIMCVSMRGTICNPGPRKSMSKLLYIRL

ALFFPEM VWASLGAAWVADGVQCDRIVVNGII ATWVSWI I IAA 
r WS 1 1 1 VFDPLGGKMAP YSSAG PSHLDSHDS SQLLNGLKTAAT
SVWETRIKLLCCCIGKDDHTRVAPSSTAELFSTYFSZ)TDLVPSD~ 
IAAGLALLHQQQDNIRNNQSPAQWCHAPGSSQEADLDAEIjKNC 
HHYMQPAAAAYGWPLYI YRNPLTGIiCRIGGDCCRSKNPQTMT /M 
\\j\aLf\iij\iU/ oaf x uti 1 HRftHVQGLnPRQLPWTRrTELPFLVA 
LDHRKESVVVAVRGTMSLQDVLTDLSAESSVLDVECEVQDRLAH 
KG I SQAARYVYQRLINDG ILSQAFS 1APEYRLVI VGHSLGGGAA 
ALLATMVRAAYPQVR CYAFS PPRGLWSKALQEYSQSFIVS LVLG 
KI>VIPRLSVTNLEDLKRRILRWAHCNK^KYKILI^LWYBL,FG 
GNPNNLPTBLDGGDQEVLTQPLLGEQSLI»TRWSPAYSFSSDS PL 
DSS P JCYPPLYPPGRI IHIjQEEGASGRFGCCSAAHYSAKWSHEAE 
r s A JL u i(j f aJIIj i DHM PD I IjMRAIjDS WS DRAACV 5 CP AQG V S S V 
DVA 


5388 


1569 


753 


TADGG AG GGGRRQ AG VRRH YL YP Fl'GG YRRRRAACQAE R PAARS - " 

KDTDLAAYQKGNLGVQLRNMAQETNHSOVPMLCSTGCGFYGNPR
TNGMCSVCYKEHLQRQNSSNGRISPPVQCTDGSVPBAQSALDST
SSSMQPSPVSNQSLLSBSVASSQLDSTSVDKAVPETEDVQASVS

DTAQQPSEEQS KSLB\NRNKKRIAVSCAGRKWDLIiGLNAGVBMF 
1 vvi rvTQMiTIAbTITKQMLKNFVyQQEFKSFGSFHQQLLEYK 
ILEHLQTKN 


5389


1569 


753 


TADGGAGGGGRkQAGURR^V£iVPFTGGYRRRRAA(^AJERPAA^^ 

KDTDLAAYQKGraiGVQLRNMAQBTNR^QVPMI^STGCGFYGNPR 

TNGMCSVCYKBHLQRQNSSNGRISPPVQCTDGSVP2AQSALDST 

SSSMQPSPVSNQSIiLSESVASSQLDSTSVDKAVPETEDVQASVS 

DTA0X3PSElEQSKSLE\NRNKKRXAVSCAGRKWDlilaGLNAGVEMF 

TVVYTVTQMYT1ALTITKQMLKNFVFQQEFKSFGS FHQQLLEYK 

ILEHLQTKN 


5390




1332 


EDPRKLMEDKMWSECEGP EMSLVCMDFQAHAREQLSKStfRb F I 
EGGADDSITRDDNTAAFKRI RLRPRYLRDVSEVDTRTTIQGEEI 
SAP I CIAPTGPHCLVWPDGEMS TARAAQAA\GI C YI TSTFAS CS 
LEDIVIAAPEGIJIWFQLYVHPDLQIiNKQLlQRVESLGFkAIiV'IT 
LDTP VCGNRRHDIRNQLR RNLTIiTDLQS PJCKGNAI P YFQMTP I S 
TSLCWNDLSWFQS ITRLP I ILKGII.TKEDAELAVKHNVQGI IVS 
NHGGRQLDEVIiAS IDAIiTEVVAAVKGKIEVYLDGGVRTGNDVIiK 
ALALGAKCI FLGDAI LWATiAS KGEHGVKEVLNI LTNEFHTSMA\ 
L TGCRSVAE INRBTL VQFSRL 


5391


X 


1292 


VKKAAGRSRGPPTAGGQRCEEAPGTVl^RRLGVRAWVKBNRGS F " 

QPPVCNKl^IQEQLKVMFVGGPNTRKDYHIESGEEVFYQLEGDM 

VLRVLEQGKHRDWIRQGE I FLLPAR VPHSPQRFANTVGLWER 

RRLETELDGLRYYVGDTMDVLFEKWFYCKDLGTQLAPI IQEFFS 

SEQYRTGKPIPDQLLKBPPFPLSTRSIMEPMSIiDAWLDSHHREL 

QAGTPLSLFGDTYETQVIAYGQGSSEGLRQNVDVWLWOLEGSSV 

Vm3GRRLSLGPWMDSLLVLSWGPSY\AW\ERTQGSVALSVT\Q 

DPACKKS PWGEPSCHGLKAATGVPSTLEVPSLPKNSPS PHYLSV 

YCRCVPHRPAHCCHPPSCPSQPRCHAPGRAAAPHLLWQTQPTAL 

PVLPGGLPPAPLLPIPLSLQTQCSTSTPRRPSIKAS 


5392 


1 


1623 


IRGSNAQKWGASGSGGAGPQPDPAGPGGVPALAAAVLGACEPR ~ 
CAAP C PL P ALSR CRG AGSRGSRGGRG AAG SGDAAAAAE W I RKG S 
FXKKPAHGWIJIPDARVLGPGVSY\A^YMGCIEVIJ^MRSLDFirT 
RlX}VTREAlNRlJIEAVPGVRGSWKKKAPlfKALA^ 
«J*io ± a ±n idi lAa ucm JjS V FA1 RQ V I ANHHM PS I S FASGGDTDMTD 
YVAYVAKDPINQRACHI LECCEGIAAQS I ISTVGQAFELRFKQ Y 
LHSPPKVALPPBRIAGPEESAWGDEEDSLEHNYYNS IPGKEPPL 
GG L VDS RLAXjTQ PCALTALDQG PS PS L RD ACS L P WD VG S TGTAP 
PGDGY VQADARGPPDHE EHLYVNTQGLDAPEPEDS PKKDLFDMR 
P FE DALKLHECS VAAGVTAAPLPLEDOVJP S PPTRRAP VAPTEEQ 

lrqepwyhgrmsrraaermlradgdflvrdsvtnpgqyvltgmh 

ACQ PKHLLLVDPEGWRTKDVLFES ISHL IDHHLQNGQ P X VAAE 
S EltHLRG WSRE P 


5393 


2 


982 


GGDSAG^5TMB^C>^SQNVCPRNLWlJJaPLrVLLIJ^ADSQAAAP 
PKAVLKLEPPWINVLQ\EDSVTLTCQGAPQP/ERSDSIQWFHNG

SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion, \=possible nucleotide insertion)

\NLIPTHTQPS\YRPKANNN\MGEYTCQTGQTSL\SDPVHLTV 
LS EWLVLQT PHLEFQEGET I MLRCHS \WRDKP\LVKVTFFQHGK 
SQKFSHLDPTFS I PQANHSHSGDYHCTGNIGYTLFSSKFVTITV 
QVPSMGSSS PMG I IVAWIATAVAAIVAAWALI YCRKKRISAN 
STDP VKAAQ FE P PGRQM IA I RXRQLEETNN DYETADGGYMTLNP 
RAPTDDDKN I YLTLPPNDHVNSNN 


5394


2 


982 


GGDSAGOTMETQMSO^VCPRNLWLLQPLTVIJiLLASAD 
PKAVIJCI£PPWXNVI^\EDS\nPLTCO^PQP/ERSDSIQWFHNG 
\NLIPTHTQPS \YRFKANNN\DSGEYTCQTGQTSIi\SDPVHLTV 
LSEWLVLQTPHLEFQEGETIMLRCHS\WRDKP\LVKVTFFQNGK 
SQKFSHIJDPTFSIPQANHSHSGDYHCTGNIGYTLFSSKPVTITV 
0 VP 55MGS SS PMGI I VAW IATAVAAIVAAVVALI YCRKKRISAN 
STDPVKAAQFEPPGRQMI AI RKRQLEETNNDYBTADGGYMTLNP 
RAPTDDDKN I YLTLPPNDHVNSNN 


5395


3135 


531 


RASDAKNQEGIiLNTRRKSTDSVPISKSTLSRSLS^QASDFDGAS 
S SGN PEAVALAPDAYSTGSS SASSTIiKRTKKPRP PSLKKKQTTK 
KPTETPPVKBTQQEPDBBSLVPSGENLASBTKTESAKTEGPS PA 
LLEETPLEPAAGPKAACPLDSESVEGWPPASGGGRVQNSPPVG 
RKTL PLTTAPEAGEVTPS DS GGQEDS PAKGHS VRLEFD YS EDKS 
SWDNQQENPPPTKK1GKKPVAKMPLRRPKMKKTPEKLDNTPASP 
PRSPAEPNDIPIAKGTYTFDIDKWDDPNFNPFSSTSKMQESPKL 
PQQS YNFDPDTCDESVDPFKTSSKTPSSPSKSPAS FEI PASAME 
ANGVIX3DGLNKPAKKKICrPLKTDTFRVKKSPKRSPIiSDPPSQDP 
TPAATP ETPP VI SAWHATD E EKLAVTKQKWTCMTVDLEADKQD 
YPQ PSDLSTFVNETKFSS PTEELDYRNS YEI EYME KIGS S LPQ D 
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE 
ALVNTAAKNQHPVPRGLAPNQESHLQVPEKSSQKELEAMGLGTP 
SEA IEITAPEGS FASADALLSRLAHP VS LCGALD YLEPDLAEKN 
PPLFAQKLQREAAH PTDVS I S KTAL YSR IGTAEVEKPAGLLFQQ 
PDLDSALQIARABI ITKEREVSEWKDKYEESRREVMEMRXIVAE 
YEKTlAQMIEDEQREKSVS\HQTVQQLVLEKEQA\liADLNSVEK 
\SLADLFRJIYEKM1G3VLEGFRKNEBVLKRCAQEYLSRVKKEEQR 
YQALKVHA\EEKLDRANAE\ IAQVRQKAQQEQAAHQASLAERSS 
CRV\ DALE RTLE Q KNKEI H ELTKI CDEL I AKMGKS 


5396 


3135 


531 


RASDAKNQEGLI^TRRKSTDSVPiSKSTLSRSLSLQASD^DGAS 
SSGNPEAVALAPDAYSTGSSSASSTLKRTKKPRPPSLKKKQTTK 
KPTETPP VKETQQE PDEES LVPSGENLASETKTBSAKTEGPS PA 
LLEETPLEPAAG PKAACPLDSES VEGWP PASGGGRVQNS PPVG 
RKTLPLTTAPEAGEVTPSDSGG QEDS PAXGHSVRLEFDYSEDKS 
SWDNQQENPPPTKKIGKXPVAKMPLRRPKMKKTPEKLDNTPAS ? 
PRSPAEPNDIP IAKGTYTFD iDKWDDPNFNPFSSTSKMQESPKL 
PQQS YN FD PDTCDE S VDPFXTSS KTPS SPS KS PAS FE IPAS AME 
ANGVIX3DGLNKPAKKKKTPLKTDTFRVKKSPKR5PLSDPPSQDP 
TPAATPETPPVI S AWHATDEE KIAVTNQKWTC^TTVDLEADKQD 
YPQPSDLSTFVNETKFSSPTEBLDYRNSYEIEYMBKIGSSLPQD 
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE 
AL VNTAAKNQHP VP RGLAPNQE SHLQVPE KS SQ KELEAMGLG T P 
SEAIEITAPEGS FASADALLSRLAHPVSLGGALDYLEPDIiABKN 
PPLFAQKLQREAAHPTDVS IS KTALYSR I GTAEVEKPAGLLFQQ 
PDLDSALQIARAEI ITKERBVSEWKDKYEESRREVMEMRKIVAE 
YEKTIAQMIEDEQREKSVS\HQTVQQLVLEKEQA\LADLNSVEK 
\ S LADL FRRYEKMKE VLEGFRKNEE VL KRGAQE YLSR VKKE EQ R 
YOALKVHAV EEKLDRANAE \ T AOVPfiTT AOOPD AAUO A «5 r.ttpo e e 
CRV\DALERTLEQKNKEIEELTKICDELIAKMGKS 


5397 


3135 


531 


RAS DAKNQEGLLNTRRXSTDS VP I S KS TLS RS LS LQAS D FDGAS " 

SSGNPEAVALAPDAYSTGSSSASSTLKRTKKPRPPSLXKKQTTK 

KPTETrPPVKETOXJEPDEESLVPSGENIASETKTESAKTEGPSPA 

LLEET PLEPAAGPKAACPLDS ESVEGWPPASGGGRVQNSPPVG 

RXTLPLTTAPEAGEVTPSDSGGQEDSPAKGHSVRLEFDYSEDKS 

SWDNQQENPPPTKKIGKKPVAKMPLRRPKMKKTPEKLDNTPASP 

PRS PAEPND I P IAKGTYTFD I DKWDDPNFNP FSSTS KMQESPKL

PQQSyNFDPDTCDKSVDPFKTS^kTPSSPSK^PASFEIPASAME""' 

ANGVDGIX3LNKPAKKKKTPLKTDTFRVKKSPKRSPLSDPPSQDP 

TPAATPBTPPVI SA\A/HATDEEKLAVTNQKVfTCMTVDIiEADKQD 

YPQPSDLSTFVNETKFSSPTBEIiDYRNSYEIEYMEICIGSSLPQD 

uut\e iuvuaJjX JjJnr in SQcS P VK5SP VRMSESPTPCSGSSFEETE 

ALVNTAAKNQHPVPRGEiAPNQESHLQVPEKSSQKELEAMGLGTP 

SEAIE ITAPEXSSFASADALLSRLAHPVSLCGALDYLEPDliAEXN 

PPLFAQKLQRHAAHPTDVSISKTALYSRIGTAEVEKPAGLLFQQ 

PDLDSALQIARAEIITKEREVSEWKDKYEESRREVMEMEIKIVAE 

YEKTIAQMIBDEQREKSVS\HQTVQQLVLEKEQA\IiADLNSVEK 

\SLADLFRRYEKMKEVLEGFRKNEEVLKRCAQBYLSRVKKEEQR 

YQAI»KVHA\ EEKLDRANAE \ IAQ VRGKAQQBQAAHQASLAERSS 

CRV\DALERTLBQKNKEIEELTKICDELIAKMGKS 


5398 


56 


5426 


SG E VCRM ESNFNQEG V PR PS YVFSADP IARPSE I NFDG I KLDLS 
HEFSLVAPNTEANSFESKDYIiQVCLRIRPFTQSBKELESEGCVH 
ILDSQTWLKEPQCILGRLSBKSSG\QM\AQKPSFFPGFLGPAT 
TQKEFFQGCIMHP\VKDLLKGQSRLIFTYGLTNSGKTYTFQGTE 
ENI RI LPRTLNVLFDSLQERL YTKMNLKPHRSR B YLRLS SBQEK 
BEIASKSALLRQIKBVTVHNDSDDTLYGSLTNSLNISEFBESIK 
D Y EQANLNMANS I XFS VWVS FFE I YNE YI YDLFV PVSS KFQ KRK 
MLRLS QDVKG YS FI KDLQW IQVSDS KBAYRLLKLGI KHQSVAFT 
KLNNASS RSHS I FTVKI LQ I EDSEMSR VIRVSELSLCDIiAGSBR 
TMKrQNEGERLRETGNIiJTSLLTLGKCINVLKNSEKS KFQQHVP 
FRESKLTHYF/QSFFNGKGKICMIVNI SQCYLAYDETLNVLKFS 
AXAQKVCVPDTI^SSQEKIFGPVKSSQDVSLDSNSNSKILNVKR 
ATISWEN3LEDLMEDEDLVEBLENABETED/VGBTKLLDEDIJ5K 
TbEENKAFISHEEKRKIiLDIilEDLKKKLINEKKEKIiTIiEFKIRE 
EVTQEFTO YWAQREADFKETLLQEREI LEENAERRLAI FKDLVG 
KCDTREEAAKDICATKVETEEATACLELKFNQIKAELAKTKGEL 
IKTTCEEIJCJa^ESDSLIQEI»ETSNKKIITQNQRlKELINII0Q 
KEDTINEFQNLKSHMENTF^CNDKADTSSIiIINNKLICNETVEV 
PKDSKSKI CS3RKRVNENELQQDEPPAKKGS IHVSSAITEDQKK 
SEEVRPNJAE IEDIRVLQENNEGLRAFLLTIENELKNEKEEKAE 
"LNKQIVHFQQSLiSLSEKKNLTLSKEVQQ'lQSNYDlAIAELiHVQK 
SKNQEQEEKIMKLSNEIETATR5ITNNVSQIKLMHTK1DEIiRTL 
VS VS Q I SN I DL LNLRDhSXGS EEDNLPNTQ LDbLGND YLVSKQV 
KEYRIQEPNRENSFHSSISAIMEECKfilVKASSKKSHQIEELEQ 
QIEKLQAEVKGYKDENNRLKEKEHKNQDDLLKEKETLIQQLKEE 
LQEKNVTLDVQ IQHWEGKRALS ELTQGVTCYKAKI KELETILE 
TQKVERSH S AKLEQD I LEKES 1 1 LKLERNLKEFQEHLQDS VKNT 
KDLNVKELKLKEEITQLTNNLQDMKHLLQLKEEEEETNRQETEK 
LKEELSASSARTON\LNADLQRiCEEBYAI)LKBKLTDAKKQIKQV 
QKEVS VMRDEDKLLRI KINELEKKkNOCSQELDMKQR\TIQQLK 
EQLINQKVEEAIQQYERACKDLNVKEKIIEDMRWl'LEEQEQTQV 
EQDQVh \ EAKLSEVERLATELDR WR VXCNDLETKNNQRS NKEHE 
tan l l» vi/jisjj l w L»yUi5L»y iss fc.yKYNADRKKWLEEKMMLITQAKEA 
ENIRNKEMKKYAEDRERFFKQQNEMEILTAQLTEKDSDLQKWRE 
ERDQLVAAIiE IQ LKAL I S SNVQKDNE I BQLKRI I SETS KIETQ I 
MDIKPKRISSADPDKLQTEPLSTSPEISRNKIEDGSWLD5CEV 
oiiWUyoiKr fJVf CiJjBAy* IirIjQF«&JWVKHPGCTEPVTVKIPK 
ARKRKSNEMEEDLVKCENfCKNATPRTNLKFPISDDRNSSVKKEQ 
KVAIRPSSKKTYSLRSQASIIGVNLATKKKEGTIiQKFGDFLQHS 
PSILQSKAKKIIETMSSSKLSNVEASKENVSQPKRAKRKLYrSB 
ISSPID1SGQVILMDQKMKESDHQI IKRRLRTKTAK 


5399 


705 


230 


GPRMAKFLSQDQINEYKECFSLYDKQQRGKIKATDLWVAMRCLG 
AS P TPGEVQRHLQTHG I DGNGELDFSTFLT IMHMQI KQEDPKKE 
ILLAMLMVDKEKKGYVMASDLRSKLTS ZJSEKLTHKEV\ DDLFRE 
\ADIEPNGKVKYDEFIHKITSYLDGTY 


5400 


931 


248 


SHCSSGMEIPPTNYPASRAAbVAQNYINYQQGTPHRVFEVQKVK 
QASMEDI PGRGHKYRUCFAVEE 1 1 QKQVKVNCTA3 VLYPS TGQE 
TAPEVHFTFEGETGKNPDEEDNTFYQRLKSMKEPLEAQNI\PDN
FGNVSPEMTLVLHZiAWACGYIIWQNSTEDTWYKMVTCtQTVKQV 
QRNDDFI ELDYTI LLHNIASQEI IPWQMQVLWHPQYGTKVKHNS 
RLPKEVQLE


5401


3 


1360 


TG WS YGPTTSLAFLAPRDF P F P P KLLI H PQAWRLS CGAGS MGS 
QAAAEWRNWASWEGSSS LSGCS MGCFKDDRIVFWTWMFSTYFME 
Mi/\tri\SiULfnijC I VluuUjAXaisolS&wU^KlUuUSPE 
SKKLPGXiGDPDIDWEESVCLNLILQKLDYMVTCAVCTRADGGDl 
HIHKKKSQQVPASPSKHPMDSKGEESKISYPNIFFMIDSF\EE\ 
VFS DMTVG KGEMVCVELVAS DKTNTFQG VI FQGS IR YEALKKVY 
DNRVSVAARMAQK\MSFGFSKYSNMEF\VR\MKGPQGKGHABMA 
VSRVSTGDTS PCGTEEDSSPAS PMHBRVTSFSTPPTPERNNRPA 
PFSPSLKRKVPRNRIAEMKKSHSANDSBEFFREDDGGADLHKAT 
NLRSRSbS GTGRSLVGS W IJCIiNRADGN FLLYAHIiT YVTL P LHR I 
LTDILEVRQKPILMT 


5402 


3445 


1563 


gecfimaaWqqndlvfbfasnvmederqj^gdpAi fpAvi VEtfV^* 

PGADILNSYAGLACVEEPNDMITESSLDVAEEEIIDDDDDDITIi 
TVEASCHDGDETIETIEAAEALLNMDSPGPMliDEKRINNNIFSS 
PEDDMWAP VTHVS VTLDGI PEVMETQQVQEKYADS PGASSPBQ 
PKRKKGRKTKP PRPDS PATTPNI SVKKKNKDGKGNT I YLWEFIiIj . 
ALLQDKATCPKYI KWTQREKG I FKLVDS KP VS RLWR KHKN KP \D 
MNYE PMGRALR YYYQRG I LAKVEGQRLVYQFKEMPKDL IYI NDE 
DPSSSIESSDPSLSSSATSNRNQTSRSRVSSSPGVKGGATTVXK 
PGNSKAAKPKDPVEVAQPSEVLRTVQPTQSPYPTQLFRTVHWQ 

pvqavpegeaartstmqdetlnssvqsir\tiqaptqvpvwsp 
rnqq\lhtvtlqtvplttviastdpsagtgsqkfilqai pssqp 
^^vlkenvmi^sqkagsppsivrjgparv\qqvltsnvqticngt 
vsv\asspsfs\atapwtlfllgssqlvahppgtvitsvhctq 
etktltqevekkesedhlkentekteqqpqpyvmvvsssngfts 

QVAMKQNELLEPNS F 


5403 


3445 


1563 



GEC FI MAAWQQNDLVFEFASNVW EDERQLGDP AI FPAV I VEHV 
PGADILNSYAGIACVEEPNDMITESSLDVAEEEIIDDDDDDITIj 
TVEAS CHDGDETIETIEAAEALLNMDS PGPMLDEKRINl^NI FSS 
PEDDMWAPVTHVSVTLDG ipevmetqqvqekyadspgass peq 
PKRK^RKTKPPRPDSPATTPNISVKKKNKDGKGNTIYLWEFtjL 
ALliQDKATCPKY I KWTQREKG I FKLVDS KPVSRLWRKHKNKP \ D 
MNYE PMGRALR YYYQRG I LAKVEGQRL VYQ FKBMPKDL I Yl NDE 
DPSSSIESSDPSLSSSATSNRNQTSRSRVSSSPGVKGGATTVLK 
PGNSKAAKPKDP VEVAQPS EVI»RTVQPTQS P YPTQLFRTVHVVCJ 
PVQAVPEGBAARTS TMQDETLNSS VQS I R\TIQAPTQVPVWS P 
RKQQ\LHTVTLQTVPLTTVIAS TDPSAGTGSQKFILQAI PS SQP 
MTVUCENVMLQSQKAGSPPS I VLGPARV\QQVLTSNVQTI CNGT 
VSV\ASSPSFS\ATAPWTIiFLIiGSSQLVAHPPGTVITSVIKTQ 
BTKTLTQc^/lsKKESEDHLKENTE 
QVAMKQNELLEPNS F 


5404


167 


mi ' 


LPVTLIFAKMKTIK2STLLLLLLVPLIKPAPPTQQDSRIIYDYGT 
DNFEES I FSQDYEDKYLDGKNI KEKETVI I PNEKSLQ LQKDE AI 
TPLPPKKENDEMPTCLLCVCLSGSVYCEEVDIDAVPPLPKESAY 
LYARFNKIKKLT\AKDFADI PNLRRLDFTGMLIED IEDGTFS KL 
SLVEELSLABNQLLKLPVLPPKLTLF2«AKYWKIKSRGIKANAFK 
KLNNLTFLYLDHNALESVPLNLPBSLRVIHLQFNNIASITDDTF 


5405 


2199 


1220 


QNS RS LHMDPQNQHG SGSSLWI QQ PSLDS RPRLD YERE IQPTA 
ILSLDQIKAIRGSNEYTEGPSWKRPAPRTAPRQEKHERTHEII 
PINVNNNYBHRHTSHLGHAVLPSNARGPILSRSTSTGSAASSGS 
NS S ASS EQGLLGKS PPTRP VPGHRSERAIRTQPKQLI VDDLKGS 
LKEDLTQHKFICEQCGKCKCGECTAPRTLPSCLACNRQCLCSAE 
SMVEYGTCMCL\VKGIFYHCSNDDEGDSYSDNPCSCSQSHCCSR 
YLCMGAMS LFLPCLLC Y P PAKG CLKLCRRC YDW IHRPGCRCKNS 
NTVYCKLESCPSRGQGKPS 


5406 


279 


2732 


RWRTYNVEGPLTFMDVAIEFCLEEWQCLDTAQQNLYRNVMLENY
RNLVFLG/ 1 IAVSKPDLITCLEQEKEPWEPMRRHEMVAKPPVMC 
SHFTQDFWPEQHIKDPFQKATLRRYKNCEHKNVHLKKDHKSVDE 
CKVHRGGYNGFNQCLPATQSK I PliFDKCVKAPHK FSNSNRHK 2 S 
HTEKKLPKCKECG1CSPCMLSHLAQHKIIHTRVNFCKCEKCGKAF 
NCPS IITKHKRINTGEKPYTCEBCGKVFNWSSRLTTHKKNYTRY 
KLYKCEECGKAFNKSSILTTHKJ IRTGEKFYKCKECAKAFNQSS 
NLTEHKKIHPGEKPYKCEECGKAPNWPSTLTKHKRIHTGEKPYT 
CBECG KAFNQFSNZjTTHKR I HTA\ EKF YKCTECGEAFSRS \SNL 
TKHKEIHTEKKP YKCBECG KAFKWSS KLTEHKLTHTGEKP YKCE 
KCGKAFNCPS I ITKHNR INTGE KP YTCEE CG KVFNWSSRLTTH K 
KNYTRYKLYKCEEOGKAFNKSSILTTHKXIHIEKXFYKCEECGK 
AFKWSSKLTEHKITHTGEKPYXCEECGKAFNHFSILTKHKRIHT 
GBKPYKCEECGKAFTQSSNLTTHKKIHl'GEKFYKCEECGKAFTQ 
SSNLTTHKKIHTGGKPYKCEECGKAFNQFSTLTKHiCI IHTEEKP 
YKCEECGKAFKWSSTLTKHKI IHTGEXPYKCBECG\ KAFKLSST 
LSTHKIIHTGEKPYKCEKCGKAFNRPSNLIEHKKIHTGEQPYKC 
EECGKAFNYSSHLNTHKRIHTKEQPYKCKECGKAFNQYSNLTTH 
NKIHTGEKLYKPEDVTVItiTTPQTFSNIK 


5407


3 


659 


RPRRRQSSCCTGWLAGWLLRAAPRFCRRTETDMEQGKGLAVIilL 
A1ILLQGTLAQSIKGNHLVKVYDYQEDGSVLLTCDAEAKNITWF 
KIX3KMIGFLTEDKKXWNLGSNAKDPRGMYQCKGSQNKSKPLQVY 
YRMCQNCIELNAATISGFLFAEI VS I FDLAVGVYF I AGTGME FR 
QS \RASDKQTLLP \NDPAPTQPLKDPRKMTQYSHLQGN \QLRRN 


5408


2745


6128 


QGS KGTCHPQAQQ PWDEG VWQEAPSQ S E PWGQSQE P PTMPQRtiP 
HARQHTPLPLGSADYRRWSVRPQGPHRDPKDSRDAAKREQGSIj 
APR PVPAS RGGKTLCKGYRQAPPGPPAQ FQRP I CSAS PPWASR F 
STPCPGGAVREDTYPVGTQGVPSLALAQGGPQGSWRFLEWKSMP 
RIiPTDLDIGGPWFPHYDFERSCWVRAISQEDQLATCWQAEHCGE 
VRNXDMS WPEEMS FX ANS S KIDRHK VPTEKGATGLS MLGNTCFM 
NSS I QCVSNTQPLTQ YFI SGRHLYELNRTNP IGMKGHMAKCY0D 
LVQBLWSGTQKNVAPLKLRWTIAKYAPRFNGFQQQDSQELLAFL 
LDGLHEDI^VHEKPYVELKDSDGRPDWEVAAEAVTDNHLRRNRS 
IWDLFHGQLRSQVKCKTCGHISVRFDPFNFLSLPLPMDSYMHL 
"EITVIKLDGTTPVRYGLRLNMDEKYTGLKKQLSDLCGLNSBQI l 
LAEVHGSNIKNFPQDNQKVRLSVSGFLCAFEI PVPVSPISASS P 
TQTDFSSS PSTNEWFTT/TTNGDLPR P I F I PNGMPN7W PCGTE X 
NFTNGMVNGHMPSLPDSPFTGYIIAVHRKMMRTELYFLSSQKNR 
PSLFGMPLI VPCTVHTRKKDLYDAVW I QVSRLAS PLPPQEASNH 
AQDCDDSMGYQYPFTIiRWQKDGNSCAWCPWYRFCRGCKIDCGE 
DRAFIGNAYIAVDWHPTALHLRYOTSQERVVDEHESVEQSRRAQ 
VE PXNLDSCLRAFTS EEELGENEMY YCS KC KTHCLATKKLDLWR 
LPPII»IIHLKRFQFVNGRWIK5QKIVKFPRESFDPSAFLVPRDP 
ALOQHKPLT PQGD E LS E PR I IiARE VKKVDAQSS AGEEDVLLS KS 
PSSLSANIISSPKGSPSSSRKSGTSCPSSKKSSPNSSPRTLGRS 
KGRLRLPQX GSXNKLS S SKENLDAS KENGAGQ ICELADALSRGH 
VLGGSQPELVTPQDHEVALANGFIjYEHEACGNGCGNGYSNGQLG 
NHS BEDSTDDQR EDTR I KP I YNLYAI SCHSGILGGGHYVTYAKN 
PNCKWYCYND3SCKBLHPDEIDTDSAYI LFYEQQGIDYAQFLPK 
TDGKKMADTSSMDEDFESDY\ EKYCVLQ 


5409 


2745 


6128 


WSKE^CMPOAQOPWDEGWQgAPSQSBPWGQSQEPPTMPQRLr" 
HARQHTPLPLG SAD YRR WS VR PQG PHRD PKDS RDAAXREQGS L 
APRPVPASRGGKTLCKGYRQAP PGP PAQ FQRP ICSAS PPWASRF 
STPCPGGAVREDTYPVGTQGVPSLALAQGGPQGSWRFLEWKSMP 
RLPTDIiDIGGPWFPHYDFERS CWVRAISQEDQLATCWQAEHCGE 
VRNKDMSWPEEMSFIANS SKI DRHKVPTBKGATGItSNLGNTCFM 
NSSIQCVSNTQPLTQYFISGRHLYELI^RTNPIGMKGHMAKCYGD 
LVQELWSGTQKNVAPLKLRWT IAKYAPRFNGFQQQDSQELLAFL 
LDGU^IOTVHEKPYVEUQOSDGRPDWBVAAEAWDNHLRRNRS 
IVVDLFHGQLRSQVKCKTCGHISVRFDPFNFLSLPLPMDSYMHI* 
EITVXKLDGTTPVRYGLRIiNMDEKYTGLKKQIfcSDLCGLNSEQI h 
LAEVHGSNIKNFPQDNQKVRJLSVSGPLCAFEIPVPVSPISASSP
TQTDFSSSPSTNEMPTLTTNGDLPRP I FI PNGMPNTVVPCGTEK " 
NFTNGMVNGHM PS LPDS P FTGYI I AVHRKMMRTELYFLS SQKNR 
PSLFGMPIiIVPCTVHTRKKDLYDAVWIQVSRLASPLPPQEASNH 
AQDCDD SMGYQ YP FTLRWQKDGNS CAWCPWYRFCRGCKIDCGE 
DRAFIGNAYIAVDWHPTALHEaRYQTSQERWbEHEBVEQSRRAQ 
VE P I NIiDS CLRAPTSEEELGENEM YYCSKCKTHCIiATKKLDIiWR 
LPP1LI IHLKRFQPVNGRWI KSQKIVKPPRBS FDPSAFLVPRDP 
ALCQH KP LTPQGDELSE PR! LARBVKKVDAQSSAGEEDVLLS KS 
PSSLSANIISSPKGSP3S3RECSGTSCPSSKNSSPNSSPRTLGRS 
KGRLRLPQIGSKNKLSSSKENIiDASKENGAGQ I CELADALSRGH 
VLGGS Q P ELVTP QDHEVALANG PL YEHEA CGNG CGNG YS NG Q LG 
NHSEEDSTDDQREDTRIKPIYNLYAISCHSGILGGGHYVTYAKN 
PNCKWYCYNDSSCKELHPDEIDTDSAYILFYBQQGIDYAQFLPK 
TDGKKMADTS S MD EDFES D Y \ EKYCVLQ 


5410

"^slii 


2 


710 


LRPPGQARHVWLAARMO^PHKEHLYKLLVIGDLGVGKTS IIKRY 
VHQNFS SHYRATIGVDFAliKVLHMDPETVVRLOLNDIAGQERFG 
NNfTRVYYREAMGAFIVFDVTRPATFEAVAKWKNDLDSKLSLPNG 
KP VS WLLANKCDQG KDVLMNNGLKMDQFCKEHG FVGW FE TSAK 

ENIKIDEASRCLVKHILANECDLMBSIEPDWKPHLTSTKVASC 
SG\CAKI LVGTFAGVW 




1302 


289 


TGPAAAGRRKALGS FGKPS PVTGLRAARRRRTRPSAPAAPS VGC 
G KRR ES DAGAGGE RAS VRTG S GRRGGRTMAGDS EQTLQNHQQ ?N 
GGEF FLIGVSGGTASGKS SVCAKI VQLLGQNEVDYRQKQ WILS 
QDSFYRVLTSEQICAKALKGQFNFDHPDAFDNEIiILKTLKElTEG 
KTVQI PVYDFV5HSRKEETVTVYPADVVLFEGI IiAFYSOBR / 1 R 
DLFQMKLFVDTDADTRLSRRVLKDI SERGRDLEQILSSSTLRFV 
KPA\ FEEFCLPPK\KYADVI I PR\GADN\R VPINLI VQH I Q\DI 
LNGGPS\NRQTNGCLNGYTPSRKRQASESSSRPH 


5412


3180 


313 


QGISNFFHKEANFWFBVSGYLISPLRSPFVDPALEWSLMASPWN 
KMEGESSRFEIHTPVSDKKKKKCSIHKERPQKHSHE I FRDSSLV 
NBQSQlTRRKKRKKDFQHIilSSPLKKSRICDETANATSTLKKRK 
KRRYSALEVDEEAGVTVVLVDKBNINNTPKHFRKDVDVVCVDMS 
IEQKLPRK\ PKTDKFQVLAKSHXAHKSEAliHSKVREKKNKKHQR 
KAASWES QRA\RDTLPQSE FPTQEES WLS VGPGGE I TELP \ ASA 
HKNKS KKKKKKS SNRE YET \ LAM PEGS QAGRE AGTDMQESQPTV 
GLDDETPQLLGPTHKKKSKKKKKKKSNHQEFESIiAMPEGSQVGS 
EVGADMQES\RPAVGLHGETAGIPAPAYKNKSKKKKKICSNHQEF 
EAVAMPESLESAYPEGSQVGSEVGrVEGS TALKG FKESNSTKKK 
S KKRKLTSVKRARVSGDDPSVPS1WSBSTLF15SVEG3X3A>!MEEG 
VKS RPRQ KKTQACLAS KHVQEAPRLE PANEEHNVETAEDSE I R Y 
LSADSGDADDSEADIjGSAVKQLQEFT pni kdratsti krmyrdd 
lerfkefkaqgvaikfgkfsvxeneoqleknvedflaltgiesad 
kllytdrypeeksvitnlkrrysfrlhig\rniarpwkliyyra 
kxmfdvnnykgkysegdteklkmyksllgndwicrigemvarrsl 
svalkfsqissqrnrgawsksetrklikaveevilkkmspqelk 

BVDSKLQEMPESCLS IVREKLYKGI SWVBVEAKVQTRNWMQCKS 
KWTEILTKRMTNGRR I YYGMNALRAKVS LIERLYE INVEDTNE I 
DWEDLASAIGDVPPSYVQTKFSRIiKAVYVPFWQKKTFPEIIDYL 
YETTLPLLFCEKIjEKMMEKKGTKIQTPAAPKQVFPFRDIFYYEDD 
S EGGGHRKRKRRPRRHAWFTP V I PVLWEAKAGWI I 


5413 


3753 


1304 


RFPAGVAPRRAWIANVSKKVSWSGRJDRDDEBAAPLLRRTARPGGG 
TPIiLNGAGPGAARQSPRSALFRVGHMSSVKLDDETiLEPXDMDPP 
nr » r| *"* r nwoivijxioiji^x c*o iuu 1 ltjoiUN yij s jj&fc»liKRiNriTAFR 
TVEIKRWVICALIGILTGI4VACFIDIVVENLAGLKYRVIKGNID 
KFTEKGGLSFSLLLWATLNAAFVIiVGSVIVAFIEPVAAGSGIPQ 
IKCFUSGVKIPHVVRLKTLVIKVSGVILSVVGGIAVGKEGPMIH 
SGSVIAAGISQGRSTSLKRDFKIFEYLRRDTEKRDFVSAGAAAG 
VSAAFGAPVGGVLFSLEEGASFWNQFLTWRIFFASMISTFTLNF 
VI>3 tYHGNMWDLSSPGLINFGRFDSEKMAYTlHEIPVFIAMGVV 
GGVLGAVFNALNYWLTMFRIR YIHRPCLQVI EAVL-VAAVTATVA 
FVLIYSSRDC^}PLQGGSMSYPI^LFCA1)GEYNSMAAAFFNTPEK 






WO 01/53312 



PCT/US00/34263 



SWSLFHDPPGSYNPLTLGLFTLVYFFIACWTYGLTVSAGVFTF" 
SLLIGAAWGRLFGISLSYLTGAAIMADPGKYALMGAAAQLGQIV 
RMTI^LTVIMMEATSNVTYGFP IMLVLMTAKI VGDVF I EGLYDM 
HIQLQ S VPFLHWEAP VTSHSLTARE VMSTP VTCLRRR E KVGV J V 
DVLSDTASNHNG FPWBHADDTQPARLQGLI LRSQI*I VI>LKHKV 
FVERSNI/3LVQRRLRLKDFRDAYPRFPPIQS IHVSQDERECTTCD 
LS E FMNPS P YTVP QEAS LPRVPKLFRALGLRHLWVDNRNQWG 
LVTRKDLARYRLGKRGLEBLSIAQT 


5414


2130 


390 


GVASAWDRALFSPLLSPTSRVFRT3PPRCVSTETGRRDRARVTS 
QWCSVLQGKLPVSGRTS LACVRS ILLS PASS PRKVG I VGGTGAR 
AGAAPRDHGRVRHRRPSSARRMTRTTGQCLAPRGCQGPRGTRS P 
RSPRSRTRRGCSASPACLP/CRSALIVAVLCYINr.LNYMDRPTV 
AGVLPDIBQPFNIGDSSSGLIQTVFISSYMVIAPVFGYLGDRYN 
ka. x lAr W£ Li VTIiG SSFIPGSH ?WLLLLTRGLVGVGEA.S Y 
STIAPTLIADLFVADQRSRMLSIFYFAIPVGSGLGYIAGSKVKD 
MAGDWHWALRVTPGU3WAVLLLFLWREPPRGAVERHSDLPPL 
NPTSWWADLRAIiARNPSFVIjSSI^FTAVAFVTGSIALWAPAFLX 
RSRWLGETPPCLPGDSCSSSDSLI FGLITCLTGVLGVGIjGVEI 
SRRLRHSNPRADPLVCATGLLGSAPFLFLSLACARGSIVATYIF 
IFIGETLI*SMNt?AIVADILLx-WIPTRRSTAEAFQIVLSHLLGD 

agspyliglisdrij^rnwppsflsefralqfslmlcafvgalgg 
aaflgtahlh 


5415


693 



298* " 


IPPKTKLELQKH\LTTLT\NQEQATIFEEVQKLRPRNEQRENEL 
I ISFLRCLFEBKQKEHIHIGEMKQTSQMAAENIGS ELPPSATRF 
RLDMLKNKAKRSLTES LE S I LSR GNKARGLQBKS ISVDLDSSI*S 
STLSNTSKEPSVCBKEALPISESSFKLLGSSEDLSSDSESHLPE 
EPAPLSPQQAFRRRANTLSHFPIECQEPPQPARGSPGVSQRKLM 
RYHSVSTETPHERKDFESKANHLGDSGGTPVKTRRHSWRQQIFL 
RVATPQKACDSSSRYEDYSELGELPPRSPLEPVCEDGPFGPPPE 

ekkrtsrelrelwqkailqqiu^lrmekenqklqasendliWr 
lkld ye e i tpclke vttvwekmlstpgrs ki kfdmekmhsavgq 

V3vt*\nrtiiKUisiWitr LA^FHJjKHQFPSKQQPKDVPYKELXiKQIjT 
SQQHAILIDLOTTFPTHPYFSAQLGAGQLtSLYNILKAYSLLDQS 

Vgycqglsfvagilllhmseeeafkmlkflmfdmglrkqyrpdm 
I ilqiq^qlsrllhdyhrdlynhlbeheigpslyaapwfltmf 

ASQFPLGFVARVFDMIFI^TEVXFKVALSLIiGSHKPLIJLQHEN 

letivdfikstlpnlglvqmektinqvfemdiakqlqayeveyh 
vi^eelidssplsdnqrr^klektnsslrkqnldlleqlqvang 
riqsiieatiekllsse8klkqamltlelersaiiqtveelrrrs 
akpsdrepectqpeptgd 


5416


27 


4074 


KSQLFCFWGGKAGDILSGDQDKEQKDPYFVBTPYGYQLDLnFLK 
YVDDIQKGNTIKRLNIQKRRKPSVPCPEPRTTSGQQGIWTSTES 
LSSSNSDDNKQCPNFLIARSQVTSTPISKPPPPIiETSLPFLTIP 
ENRQLP P P S PQLPKHNLHVTKTLMETRRRLEQERATMQMTPGE P 
RRPRLAS FGGMGTTS SLPSFVGSGNHNPAKHQLQKG YQGNGD YG 
SYAPAAPTTSSMGSSIRHSPLSSGISTPVTNVSPMHIiQHIREQM 
AI ALKRLKELEEQVRTI PVLQVKI SVLQEEKRQLVSQLKNQRAA 
SQINVCGVRJOISYSAGNASQLEQLSRARRSGGELYIDYEEEEME 
TVEQSTQRI KEFRQL \ TADMQALEQK IQDSS CEAS SELRENGEC 
RSVAVGAEENMNDIVVYHRGSRSCKDAAVGTLVEMRNCGVSVTE 
AMLGVMTEADKE I BLQQQT I ES LKEK I YRLEVQLRETTHDREMT 
KL KQELQAAGS RKKVD KAIWAQ P LVFS KW EA WQTRD QMVG S H 
MDLVOTCVGTSVETNSVGISCQPECKNKVVGPELPMNWWIVKER 
VEMHDRCAGRS VEMCDKS VSVE VS VCETGSNTEES VUDLTLLKT 
NLNLKEVRS IGCGDCSVDVTVCSPKECASRGVNTEAVSQVEAAV 
MAVPRTADQDTSTDLEQVHQPTNTETATL IES CTNTCLSTTjDKQ 
TSTQ TVETRTVAVGBGRVKD INS STKTRS IG VGTLLS GHSGFDR 
P SA VXT KES GVG Q IN I NDNYL VG LKMRT I ACGP PQLTVGL TAS R 
RS VGVGDDP VGE SIiBNPQ PQAPIjGMMTGLDHYIER IQKLLAEQQ 
TLLAENYSELAEAFGE PHSQMGSLNSQLlSTIiSS INSVMKSAST 
EELRNPDFQKTSLGKITGSYLGYTCKOGGLQSGSPLSSQTSQPE
QEVGTSKG KPISSLnAFPTQEGTLSPVNLTODQIAAGLVACTNN 

PQ1*T.lf CTMtrYEm^rkTVnr>VTS"»1t tTTTXTT nr>1 rn r iTnniir>mnn nv^v^Ma* mm 

i u*«J.W1UUUA3WaUSNGAiUvNLQFVGI 
BSSSSESDDECDVIEYPLEBEEEEBDBDTRGMAEGHHAVNIEGIj 
KSARVEDEMQVQECEPEKVEI RERYELS EKMLSACNLLKNT IND 
PKALTSKDMRFCIiNTLQHEWFRVSSQKSAlPAMVGDYIAAFEAI 
SPDVLRYVTNLADGNGNTALHYS VS HSNFE X VKLLLDADVCNVD 
HQN2CAGYTP I M LAALAAVEAE KDMRIVEELFGOGDVNAKASQAG 
QTALf^VSHGRIDMVKGLLACGADVNlQDDEGSTALMCASEHG 
HVEIVKLLLAQPGCNGHLEDNDGSTALS I ALEAGHKD I AVTiL YA 
HVNFAKAQSPGTPRIiGRKTSPQPTHRGSFD 


5417 


27 


4074 


XSQLFCFWGGXAGDILSGDQDKEQKDPYFVETPYGYQLDLDFLK 

YVDDIQKGNTIKRIiNIQKRRKPSVPCPEPRTTSGQQGIWTSTES 

L8SSNSDDNKQCPNFLIARSQVTSTPISKPPPPLETSLPFLTIP 

ENRQLPP PS PQLPKHNLH VTKTLMETRRRLEQERATMQMT PGBF 

RRPRLASFGGMGTTSSLPSFVGSGNHNPAKHQLQNGYQGNGDYG 

SYAPAAPTTSSMGSSIRHSPLSSGISTPVTNVSPMHLQHIREQM 

AIALICRLKELEEQVRTIPVLQVKISVLQEEKRQLVSQLKNQRAA 

SQINVCGVRKRSYSAGNASQLEQLSRARRSGGBLYIDYEEEEMB 

TVEQ3TQR3 KEFRQIi\TAOMQALEQKIQDSSCEASSEliRENGEC 

RSVAVTGAEEN14NDIVVYHRGSRSCKDAAVGTLVEWRNCGVSVTB 

AMLGVMTEADKEIELQ0X5TIESLKEKIYRLEVQLRETTHDREMT 

KLKQSLQAAGSRKKVDKATMAQPLVFSKWEAVVQTRDQMVGSH 

MDLVDTCVGTS VETNS VG ISCQPECKNKWGPELPMNWWI VKER 

VEMHDRCAGRSVEMCDKS\^VEVSVCETGSNTEESVNDLTLLKT 

NLNLKEVRSIGCGDC9VDVTVCSPKECASRGVNTEAVSQVEAAV 

MAVPRTADQDTSTDIiBQVHQFTNTETATLZESCTNTCLSTIiDKQ 

TSTQTVETRTVAVGEGRVKDINS STKTRS IGVGTLLSGHSGFDR 

PSAVKTKESGVGQINI NDNYLVGLKMRT IACGPPQI .TVGLTASR 

RSVGVGDDPVGESLENPQPQAPLGMMTGLDHYIER1QKLLAEQQ 

TLLAENYSEIAEAFGEPHSQMGSLNSQLISTLSSINSVMKSAST 

EELRNPDFQKTSLGKITGSYLGYTCKOGGLQSGSPLSSQTSQPE 

QBVGTSEGKP ISSLDAPPTQEGTLSPVNLTDDQIAAGLYACTNN 

ESTLKS IMKKKDGNKDSNGAKKNLQFVGINGGYETTSSDDSS SD 

ESS S S ESDDBCDVIE YPLEEEEEEEDBDTRGMAEGHHAVNI EGli 

KSARVEDEMQ VQECEPE KVEIRER YEiSBKMIiSACNLIiKNTIND 

PKALTSKDMRFCIiNTLQHEWFRVSSQKSAIPAIWGDYIAAFEAI 

SPDVLRYVINLADGNGNTALHYSVSHSNFE I VKLLLDADVCNVD 

HQNKAG YTP I MLAALAAVEAEKDMR I VEELFGCGD VNAKAS QAG 

QTALMIlAVSHGRlD^^GLIlACGAJ^VNIQDDEGSTAl^CASEHG 

HVBIVKLIXAQPGCNGHLEDNDGSTALSIALEAGHKDIAVLIiYA 

HVNFAXAQSPGTPRLGRXTSPGPTKRGSFD 


5418 


24 




a VFKAUUDMiJTGAABLYDQALIiGI IiQHVGNVQDFLRVLFGFLYR 
KTDFYRLLRHPSDRMGFPPGAAQALVLQVFKTFDHMARQDDEKR 
RQELEEKrRRKEEEEAKTVSAAAAEKEPVPVPVQEIEIDSTTEL 
DGHQEVEKVQPPGPVKEMAHGSQEAEAPGAVAGAAEVPR\EP?I 
LPRIQEQFQKNPDSYNGAVRENYTWSQDYTDIiEVRVPVPKHWK 
G KQVSVALSSS S I R VAM t»EEKGERVLM3GKLTHKINTES S L WS L 
EPGKCVLVNLSKVGEYWWNAI LEGEEPXDIDKINKERSMATVDE 
EEQAVLDRLTFDYHQKUK5KPQSHELKVHEMLKKGWDAEGSPFR 
GQRFDPAMFNISPGAVQF 


5419 


1355 


5kq 


GTHP LDPDLVSRTS VQGP LMTMACPGMS DTE ES PFLGPRAAEEG 
SESEACEAFGRRiCSEEEGRRSDTSGFGRSRKHKVNWKHPERADA 
KDPASLPQC/LGP/IX!VRPAQPSSKYCSDDCGMKLAANRIYEIL 
PQRIQQWQQSPCIAEEHGKKLliERIRREQQSARTRLQEMERRFH 
ELEAIILRAKQQAVREDEBSNEGDSDDTDLQIFCVSCGHPINPR 
VAIiRHMERCYAKYESG/rSFGSMYP^IEGATRLFCDVYNPQSKT 
YCKRLQVLCPEHSRDPKVPADEVCGCPLVRDVFELTGDFCRLPK 
R0^MRHYCWEKLRRAE\^LERVRVOTK1^EI<FEQER1JVRTAMTN 
RAGLLAL ML HQ T I QKD P LTTDL RS S ADR 


5420


117 


1733 


NBAGGACPFKGGASGRLYLSPRIiPRVSVAGCEERPLGWVWVLGG ' 
GGFLPARPPRAQRHLGFSHAEQSMBAPDYEVLSVREQLFHBRIR
ECIISTLLFATIiyiLCHIF l L^RFKjCPAEFTl i \GMMKMPP"S^RL7 1 
LLELCTFTLAI ALGAVLLLPFS I ISN3VLLSLPRNY Y I QWLNGS 
IilHGLWNLVFLFSMIjSI.TPLMPPftV7ri7TT?ei?rti?ar»eovi^trr r*n\r I 

YETVVMLMLLTLLVLGM VWVASA I VDKNKANRESLYDFWE YYLP 
YLYSCISFI/5VLLLLVCTPLGLARMFSVTGKLLVKPRLLEDLEE 
QLYCS AFBEAALTRRI CNPTSCWLPLIl>lELLHRQVliALQTQRVL 
LEKRRKASAWQRNLGYPLAMLCLLVLTGLSVLIVAIHILELLID 
EAAMPRGMQGTSLGQVSFSJCLGSFGAVIQVVLIFYliMVSSWGF 
YSSPLFRSLRPRWHDTAMTQIIGNCVCLLVLSSALPVFSRTliGL 
TR FOLLGDFGR FNWLGNF Y I VFLYNAAFAGLTTLCLVKT FTAAV 
RABL IRAFGBRE | 


5421


117 


1733 


NEAGGACPFKGGASGRLYLSPRLPRVS^AGCEBR?LGWVWVLGG 
GGFLPARP PRAQRHLG FS HAEQSMEAPD YE VLS VREQLFHE R IR 
ECU STLLFATLYI LCH I FLTRFKKPAEFTT \GMM KMPPSTRI*/ 
LLELCTFTLAIALGAVLLLPFS II SNEVLLSLPRNYYIQWLNGS 
LIHGLWNLVFLFSNLSLI FLMPFAYFFTESBGFAGSRKGVLGRV 1 
YETVVMLMlXTLl.VLGMVWVASAIVDKNKANRESLYDFWEYYIiP 
YLYSC I SFLGVLLLLVCTPLGLARMFSVTGKLLVKPRIiLEDLEE 
QLYCSAFEEAALTRRICNPTSCWLPLDMELLHRQVLALQTQRVIi 
LEKRRKASAWQRNLGYPIiAMIjCLLVLTGIiS VLIVAIHILELL I D 
EAAMPRGMQGTS LGQVS FSKLGSFGAV I QWLI F YLMVS SWGF j 
YSSPLFRSLRPRWHDTAMTQIIGNCVCLLVLSSALPVFSRTLGL 
TRFDIjLGDFGRFNWLGNFYIVFLYNAAFAGLTTLCLVKTFTAAV 
RAELIRAFGERE | 


5422


3 


1263 


SCGBSLPTWIiAGASRPGIGRKGGAWGGRGGSSPAQVLLSPGPVF I 
KAGCNWWHIiSRDQAGVQRCDLGSSQPPPLGFKRFSCLSLPSSWD I 
YRSTVLCVSKMFJU>LSGFNIDAPRWDQRTFIX5RVIMFliNITDPR 
rVFVSERELDKAKVMVEKSRMGWPPGTQVEQIiI iYAKKLYDSAF 
HPDTGEKMNVIGRMSFQLPGGMIITGFMIiQFYRTMPAVlFWQWV 
N(^FNAiVNYTNRNAASPTSVRQMALSYFTATTTAVATAVGMNM 
LTKKAPPLVGRWVP FAAVAAANCVNI PMMRQQBLZ KGICVKDRN 
ENEIGHSRRAAAlGITQWISRITMSAPGMiriLPVIMERIjEKLH 
FMQKVKVL/SAPLQVMIiSGCFIilFMVPVACGLFPQKCELFVSYL 
EPKLQDTIKAKYGELEPYVYFNKGL *" | 


5423 


3186 


905


GVSMALGEEKAEAEASEDTKAQSYGRGSCRERELDI PGPMSGEQ f 
PPRLKAEGGLIS PVWGAEG I PAP TCWI GTD PGG PS RAHQ PQASD I 
ANRE P VAERS E PALSGLPPATMGSGDIiLLSGES Q VEKTKLS S SE 
EFPQTLSLPRTTICSGHDADTEDDPSLAOLPO^LDLSQQFTOSG 
LSCXSQWKSVLSPGSAAQPSSCSISASSTGSSLQGHQBRAEPRG 
GSLAKVSSSLEPWPQEPSSWGLGPRPQWSPQPVFSGG0A5GI* 
GRRRLSFOABYWACVLPDSLPPSPDRHSPLWNPNiOSYEDIiLDYT 1 
YP LR PGPQLPKKLDS rvpadpvlqdsgvdldsfsvs PASTLKS P 
TNVSPNCPPAEATALPFSGPREPSLKQWPSRVPQKQGGMGIASW 

oyuHO x trt^t\f^i>H.UJ\K.V4 ^KK kPAJ^txAi^RkxXGlGfLDMGSP QIi [ 

RTRDRGWPSPRPEREIOZTSQSARRPTCTESRWKSEEEVESDDEY 
IJU^PARLTQVSSLVSYLGSISTLVTLPTGDIKGQSPLEVSDSDG 
PASFPSSSSQSQLPPGAALQGSGDPEGQNPCFLRSFVRAHDSAG 
EGSE/3SSQALGVSSGLLKTRPSLPARIiDRWPFSDPDVEGQtiPRK 
GGEQGKE S LVQC \ VKTFC\ CQLEEIiICFTIi YNV\ AD VTDHGTPAR 
SNLTSLK\SSLQLYRQFKKDIDEHQSIiTESVLQKGEILLQCLLE 
NT P VLB DVLGRIAKQSGELESHADRLYDS I LAS LDMLAGCTL I P 
DKKPMAAMEHPCEGV 


5424 


3186 


905 


GVSMALGEEKAEAEASEDTKAQSYGRGSCREREIiDIPGPMSGEQ 
P PRLEAEGGL IS P VWGAEGIPAPTCWIGTDPGGPSRAHQ/PQAS D 
ANREPVAERSEPALSGL PPATMGSGDLZiLS GESQVEKTKLSSS E 
EFPO/TLSLPRTTICSGffl3ADTEDDPSLADIiPQALDLSQQPHSSG 
LSCLSQWKS VLS PGSAAQPSS CS ISASS TGSS LQGHQERAEPRG 
GSIiAKVSSSLEPWPQEPSSWGLGPRPQWSPQPVFSGGDASGI. 
SRRRLS FQAEYWACVLPDSLPPS PDRHSPLWNPNKEYEDLLDYT 
YPLRPGPQLPKHLDSRVPADPVLQDSGVDLDSFSVS PASTLKS P 
rNVSPNCPPAEATALPFSGPREPSLKQWPSRVPQKQGGMGLASW j
*WWw * if KUAK W BaHX a e ALKUAiUJRLI XGKHLDMGSPQL 
RTRDRGWPSPRPEREKRTSQSARRPTCTESRWKSEEBVESDDEY 
LAL PARLTQVSS LVS YLGS ISTLVTLPTGD I KGQS PLEVSDSDG 
PASFPSSSSQSQLPPGAALQGSGDPEGQNPCFLRSFVRAHDSAG 
BGSLGSSQALGVSSGLIiKTRPSLPARLDRWPFSDPDVEGQLPRK 
GGEQGKESLVQC\VKTFC\CX3LEELICWLyNV\ADVTDHGTPAR 
SNLTSLKNSSLQLYRQPKXDIDEHQSLTESVLQKGBILLQCLLE 
NT P VLEDVLGR I AKQSGE LESHADRLYDS I LAS LDMLAG CTLI P 


5425 


1086 


115 


GFCPSPSLGHQPPRVLHPTMSJ^VETFtSFFMATVGLLMLGVTI^P - 
NS YWRVSTVHGNVITTNT I FENLWFSCATDS LGVYNCWE FPSML 
AliSGYIQACRALMITAILLGFLGLLIiGIAGIjRCTNIGGLELSRK 
AXItAATAGAPH \ ILPG I OGMVAI \ S WYAFNI TR \ DFSDPL YPGT 
KYE LG PALYLGWS ASL I S I LGGLCLCS ACCCGS DEDP AAS ARRP 
YQAPVSVMPVATSDQEGDSSFGKYGRNALRVAALCRGPRCLPTA 
PKKRGPGRGP FP YSNLRGRPRP VPVAPPRPRPRVLHSHGPSQAK 
NCSWEVAYLPSEAGSLIF 


5426


42


3435 


ATSSQSLGRADPPRGGTMERSPGEGPSPSPMDQPSAPSDPTDQP" 
PAAHAKPDPGSGGQPAGPGAAGEAIiAVLTSFGRRLLVLI PVYLA 
GAVGLSVGFVLFGIiAIjYLGWRRVRDEICEHSIjRAARQLLDDEEQL 
TAKTLYMSHRELPAWVSFPDVEKAEWLNKIVAQVWPFLGQYMEK 
LLAETVAPAVRG SNPHLQTFTFTRVELGEKPLR I IGVKVHPGQR 
KEQILLDLNISYVGDVQIDVEVKKYFCKAGVKGMQLHGVLRVIL 
EPLlGDLPFVGAVSMFFIRRPTXiDINWTGMTNLLDIPGLSSLSD 
TMIMDSIAAFLVLPNRLLVPLVPDLQDVAQLRSPLPRGI IRIHZi 
LAARGLS SKD KYVKGL I EG KSDP YALVRLGTQTFCSR V 1DEELN 
PQWGETYEVMVHEVPGQE I EVEVFDKDPDKDDFLGRMKLDVGKV 
LQASVlJDDWFPLQGGQGQVHLRLKWIaSLLSDAEKLEQVLQWNWG 
VSSRPDPPSAAILWYLDRAQDLPMVTSELYPPQLKKGNKEPNP 
MVQLSIQDVTQESKAVYSTNCPVWEEAFRFFLQDPQSQBLDVQV 
KDDS RALTLGALTLPLARLLTAPELIIiDQWFQ LSS SG PNSRL YM 
KLVMRILYLDSSBICFPTVPGCPGAWDVDSENPORGSSVPAPPR 
PCHTTP DSQFGTEHVLR IHVLEAQDLIAKDRFJjGGLVKG KSDP Y 
VKLl^GRSFRSHVVREDLNPRWNEVFEVIVTSVPGQBLEVEVF 
DKDLDKDDFl^RCKVRLTTVLNSGFLDEWLTLBDVPSGRLHIjRD 
ERLTPRPTAABLEEVIiQWSLIQTQKSAELAAALLSIYMERAED 
LPLRKGTKHLS P YATLTVGDS SHKTK7 ISQTSAP VWDES AS FLI 
RKPKTESLEXQVRGEGTGVLGSLSLPLSELLVADQLCXDRWFTL 
S SGQGQ VLLRAQLG IL VS QHSG VEAHSHS YSHS SS S L SEE PELS 
GGPPHI TSSAPEV\RQRI»THVDSPLEAPAGPLGQVKLTXW YYSE 
ERKLVS I VHGCRSLRQNGRDP PDPYVSLLiLLPDKNRGTKRRTSQ 
KKRTLS PEFNERFEWELPLDEAQRRKLDVSVKSNSS FMSREREL 
LGKVQLDLAETDLSQGVARWYDLMDNKDKGSS 


5427 


42 


3435 


ATSSQSLGRADPPRC^TMERgPGEGPSPSP^QPSAPSDPTDQP^ 

PAAHAKPDPGSGGQPAGPGAAGEALAVLTSFGRRLLVLIPVYLA
GAVGLSVGFVLFGLALYLGWRRVRDEKERSLRAARQLLDDEEQL
TAXTLYMSHRELPAWVSFPDVEKAEWLNKIVAQVWPFLGQYMEX

LLAETVAPAVRGSNPHLQTFTFTRVEIiGEKPLRI IGVKVHPGQR 
KEQILLDLiNISYVGDVQIDVEVKKYFCKAGVKGMQLHGVLRVIL 
EpLIGDLPFVGAVSMFFIRRPTLDINWTGMim.LDIPGLSSLSD 
TMIMDS I AAFLVLPNRLLVPLVPDLQDVAQLRS PLPRGI XR I HL 
IMRGLSSKDKYVXGLIEGKSDPYALVRLGTQTFCSRVIDEELN 
PQWGETYEVMVHEVP<^EIEVEVFDKDPDKDDFLGRMKLDVGKV 
LQAS VLDDWFPLOGGQGQVHLRLEWIiS LLS DAEKLEQ VLQWNWG 
VSSRPDPPSAAILWYLDRAQDLPMVTSELYPPOLKKGNKEPNP 
MVQLS IQDVTQES KAVYSTNCPVWEEAFRFFLQDPQSQEIiDVQV 
KDDSRALTLGALTLPLARLLTAPELILDQWFQLSSSGPNSRLYM 
KLVMRILYLDSSBICFPTVPGCPGAWDVDSENPQRGSSVDAPPR 
PCHTTPDSQFGTBHVLRIHVLEAQDLIAXDRFLGGLVXGKSDPY 
VKLKLAGRS FRSHWREDLNPRWNEVFEVI VTSVPGQELEVBVF 
DIO)LDKDDFliGRCKVRLTT\mNSGFIiDEWLTLEn5VPSGRUaRL 
ERLTPRPTAAfiLEBVI^VNSLIQTQKSAELAAAJXSIYMBRAED 
LPLRI^TKHLSPyATLTVGDSSHKTKTISQTSAPWDESASFIiI 
RKPHTBSLBLQVRGEGTGVLGSLSLPLSELBVADQLCLDRWFTL 
SSQQGQVLLRAQl>GILVSQHSGVEAHSHSYSHSSSSLSEEPELS 
GGPPfrrTSSAPEVXRQRLTHVDSPLEAPAGPLGOVKLTLWYYSE 
ER KLVS IVHG CRSLRQNGR DPPDP YVSLLLLPDKNRGTKRRTS Q 
KKRTLSPEFNE^FEWELPLDEAQRRKLDVS VKSNSS FMSREREL 
LGKVQLDLAETDLSQGVARWYDLMDNKDKGSS 


5428 
5429 


3 


1839 


SSRSERLSACAIAPPWLVSSRPARPAQLQRPGKMVEDGAEELED 
LVHFSVSELPSRGYGVMEEIRRQGKLCDVTLKIGDHKFSAHRIV 
LAAS I PYFHAMFTNDMMECKQDEI VMQGMDPSALEALINFAYNG 
NLAIDQQNVQSLIxMGASFLQLQSIKDACCTFLRERLHPKNCLGV 
RQFABTMMCAVLYDAANSFIHQHFVEVSMSEEFLALPLEDVLEL 
VSRDELNVKSEEQVFEAALAWVRYDREQRGTPL\RNLQSNIRLL 
FCR PQFLSDRVQQDDLVRCCHKCRDL VDEAKD YL LMP ERRPHLP 
AFRTRPRCCTS IAGLIYAVGGLNSAGDSLNWEVFDP I ANCWER 
CR PMTTAR SR VG VAWNG LL YAI G G YDGQLRLS TVQAYNTE T DT 
WTR VG S MNS KRS AMGTWLDGQ I YVCGG YDGNS S I>SS VE T Y S PE 
TDKWTWTSMSSNRSAA\GVTVFEGRIYVSGGHDGLQIFSSVEH 
YNHHTATWHPAAGMLNKRCRHGAASLGSKMFVCGGYDGSGFLS1 
AEMYSSV\ADQMCLIVPM\HTRR\SRVSLGGPAVGRLYAVWGVT 

TGQSNL\ S S VGDVLTPETDCWTFM \ APMACHEGGVGVGC I PLI*T 
I 




828


202 


RREDALSSEG CLWPSEST VSGNG I PEPQVYAPPRPTDRLAVP P F 
AQRERFHRFQPTYPYLQHEIDIiPPTISLSDGEEPPPYQGPCTIjQ 

LRDPEQQLEIINRESVRAPPNRTLFDSDLMDSARLGGPCPPSSNS
GISATCYGSGGRMEGPPPXTYSEVIGHYPGSSFQHQQSSGPPSL
LEGTRLHHTHIAPLESAAIWSKEKDXQKGHPL


5430 
5431


441 


1^07 


QKRRKRRRKKIMKTIQPKHHNSISWAI FTGLAALCLFQGVPVRS 
GDATFPKAMDNVTVRQGE SATLRCT IDNRVTRVAWLNRSTI LYA 
GNDKWCLDPRVVLLSNTQTQYSIEIQNVDVYDEGPYTCSVQTDN 
HPKTSRVHLIVQVSPKIVEISSDISINEGNNISLTCIATCRPEP 
TVTWRHISPKAVGPVSEDBYLEIQGITREQSGDYECSASNDV\A 
APV\VRRVKVTVNYPPYISEAKGTGVPVGQKGTLQCEASAVPSA 
EFQWYKDDKRLI /EGKKG VKVENRPFLS KLIFFNVSEHDYGN YT 

CVASNKLGHTNASIMLFGPGAVSEVSNGTSRRAGCVWLLPLLVL 
HLLLKF 




2 


1312 


AAAAPG SRRRR PLPDRPHMAHGYEAPPPPA PRSPAWKARSKP V ^ 
ijtHjX l IWP \TIAEGPSP \TSEGASEANLVDLQKKLEELELDEQQ 
KKRLEAFLTQKAKVGEIjKDDDFERISEIjGAGNGGVVTKVQHRPS 
GLIMARKLIHIjEIKPAIRNQIIRELOVLHECNSPYIVGFYGAFY 
S DGE I S I CMEHKDGGS LDQVLKEAKR IPEEILGKVS IAVLRGLA 

YLREKHQIMHRDVKPSNILVNSRGEIKLCDFGVSGQLIDSMANS
FVGTRS Y^PERL(^ wtysvqsdi wskgls lvelavgr yp ippp 
DAKELEAIFGRPVVDGEEGBPHSISPRPRPPGRPVSGHGMDSRP
AMAIFELLDYIVNBPPPKLPNGVFTPDFQEFVNKCLIKNPAERA

0LKMLTNHTFIKRSEVEEVDFAGWLCKTLRLNQPGTPTRTAV 


5432 
5433 


2 


1312 



AAAAPGSRRJiRPLPDRPHMAriGYEAPPPPAPRSPAWRARSKPV^ 
• Jjro<L \ a AAeiai'of \XomoASEANIiVDLQKKLBELELDEQQ 
KKRLEAFXTQKAKVQELKDDDF^RISEI.GAGNGGWrKVQHRPS 
GLIMARKLIHLEIKPAIRNQI I RELQVLHE CMS P Y I VG FYGAF Y 
S DGE I S I CMEHMDGGSLDQVLKEAKRI PEE I LGKVS IAVLRGLA 
YLREKHQIMHRDVKPSNILVNSRGEIKLCDFGVSGQLIDSMANS 
FVGTRSYMAPERLQGTHYS VQSDI WSMGLSLVELAVGRYPI PPP 
DAKELEAIFGRPWDGEEGEPHSISPRPRPPGRPVSGHGMDSRP 
AMAIFEiLDYIVNEPPPKLPNGVFTPDFQEFVNKClilKNPAERA 
DLKMLTNHTFI XRSEVBEVDFAG WLCKTIiRLNQPGTPTR TAV 




360


1885 



3 VQED KVGFED PLHLCS WRARACPCTW PHC j CTG LLECLGFAGV 
jFGWPSLVFVFKNEDYFKDLOSPDAGPIGNATGOADCKAQDERF 
5LIFTIX5SFMNNFMTFPTGYIFDRFKITVARLIAIFFYTTATLI 
IAFTSAGSAVLLPIjAMPMLTIGGIIjFLITNLQIGNLFGQHRSTI 
ITLYNGAFDSSSAVPLI IKLLYEKGISLR/VHiHLHLCLQYTAC 
STHFPPDAPGAHPIPTAPQLQLWPVPWEWHHKGREJG/QQLSMKT 
GS YSQRSSFQRRKRPQGQGRSRNSAPSGATL/ csrrpawhlvwl 
SVIQLWHYLF1GTLNSLLTOIMAGGDMARVSTYTNAFAPTQFGVL 
CAPWNGrJ^RLKQKYQKEARKTGSSTLAVALCSTVPSIALTSL 
LCIjGFALCASVPILPLQYLTFILQVISRSFLYGSNAAFLTIAFP 
SEHFGKLFGLVMALSAWSLLQFP1FTLIKGSLQNDPFYVNVMF 
MLAILLTFFHPFLVYRECRTWKESPSA1A 


5434 


66 


652 


RYAALIISLIQHKLLWRNQHCSRCVIMSPAQSAGLNWLF /GSGK " 
HGPFLGCSQYPACDYVRPLKSSADGHIVKVLEGOVCPACGANLV 
LRQGRFGMFI G CINYPK CEHTB L I DKPDETA I TCPQCRTGHhVQ 
RRSRYGKTFHSCDRYPECQFAINFKPIAGBCPECHYPLLlEKlcr 
A0GVKHFCASKQC3GKPVSAE 


5435


4704


1597


PGDSSQRLAEMSNAKERKHAKXMRNQPIWVTLSSGFVADRGVKH 
HSGGEKPFQAQKQEPHPGTSRQRQTRVNPHSLPDPEVNRQSSSK 
GMFRKKGG WKAGPEGTSQEI PKYI TASTFAQARAAEISAMLKAV 
TQKSSNSLVFQTLPRHMRJUy^SHNVKRLPRRLQEIAQKEAEKA 
VHQKKEHS KNKCHKARJICHMNRTI»E FNRRQKKN I WLBTH I W1IAK 
R FHMV KKWG YCLGERPTVKS HRACYRAMTNRCI»LQDLS Y YCCLi E 
LKGKEEEILKALSSMCNIDTCSLTFAAVHCLSGKRQGSLVIiYRVN 
KYPREMLGPVTFIWKSQRTPGDPSESRQLWIWIiHPTLKQDIIiEE 
IKAACQCVEPIKSAVC1ADPLPTPSQEKSQTELPDEKIGKKRKR 
KDDGBNAKPIKKI IGDGTRDPCLP YSWIS PTTGI 1 1 SDLTMEMN 
RFRLIGPIiSHSILTEAIKAASVHTVGEDTEETPHRWWIETCKKP 
DS VS LHCRQEAI FELliGG I TSPASI PAGTILGLTVGDPRINL PQ 
KKSKALPNPEKCQDNBKVRQLLLEGVPVECTHS F I WMQDI CKS V 
TENKISDQDLNRMRSELLVPGSQLILGPHESK1PILLIQQPGKV 
TG EDRI^SWG SGWDVUjP KGNGMAFW I P FT YRGVR VGGLKB SA VH 
SQYKRSPNVPGDFPDCPAGMLFAEEQAKNLLEKYKRRPPAKRPN 
YVKLGTLAPFCCPWBQLTQDWESRVQAYEEPSVASSPNGKESDIj 
RRS E V P CAPM P KKTHQ PS DEVGTS 1 EHPREAE EVMBAGCQESAG 
PER I TDQHAS ENHVAATCSHLCVLRSRKLLKQLSAWCGPSS EDS 
RGGRRAPGRGQQGLTREACLSIIjGKFPRAIjVWVSLSLLSKGSPE 
PHTM I CVPAKEDFLQI*HEDWH YCGPQES KH5DP FR5 KI LKQKEK 
KKP^KRQKP\GRASSIX3PAGEEPVAGQEALTLGLW3GPLPRVTL 
HCSRTLLGFVTQGDFSMAVGCGEALGFVSLTGLLDMLSSQPAAQ 
RGLVLLRPPASLQYRPARIAIBV 




1781 


635 


ASDS 1 PWSEARTTRKLAQRGCQWSLPERMPLVVFCGLP YSGKSR 
RAEELRVALAAEGRAVYVVDnAAVLGAEDPAVTCDSAREKALRG 
AIiRASVERRLSRHDVVILDSLKYIKGFRYELY\CLARAARTPIjC 
L VYCVR PG GP I AG P QVAGANEN PGRNVS VS WRPRAEED GRAQAA 
GSSVLR^LHTADSVVNGSAQADVPKELEREBSQAAESPAIiVTPD 
S EKS AKHGSGAF YS PELLEALTLRFEAPDSRNRWDRPLFTLVGL 
EBPLPLAGIRSALFE^IRAPPPHQSTQSQPLASGSFLHQIjDQVTS 
QVLAGLMEAQKSAVPGDLLTLPGTTEHLRFTRPLTMAELSRLRR 
Q FI S YTKMHPNNBNIf PQLANM FLQYLSQSLH 


5437


739 


1^72 


CQEAASEFGGPLHTPAMFLRRLGGWLPRPWGRRKPMRPDPPYPE 
PRRVDS SS ENSGSDWDSAPETMEDVGHPKTKDSGALRVSRAASE 
PSKEEPQVEQLGSKRMDSLKWDQPISSTQESGRLEAGGASPKLR 
WDHVDSGGTRRPGVSPEGGL\GVPGPGAPLEKPGRREKLLGW1*R 
GEPGAPSRYl/SGPEECMISTKLTIJILLELliASALIALCSRPLR 

CLFGLLQALVLAVS1*REPNGDEAATDWBSEGI*EREGEEQRGDPG 
KGL 


5438 


2443 


1152 


TKPRKRRHQPASQRQRPWSSDSTGDLLARGKGRKEENKGSDRVS 
LAPPSLRi?PMMCQSEARCX3PBljRAAKWLHFPQLALRRRlK3QIjSC 
MSRPAI*KI*RS W PLTVLYYLLPFGALR PLSRVG WRPVS RVAL YKS 
VPTRLLSRAWGRLNQVELPHWLRRPVYSLYIWTFGVNMKEAAVE 
DLHHYRNLSE FFRRKLKPQARP VOGLHS VI SPSDGR I LNFGQ VK 
NCEVEQVKGVTYSLES FLGPRMCTEDL PFP PAASCDS FKNQLVT 
REGNELYHCVIYLAPGDYHCFHSPTDWTVSHRRHFPGSLMSVNP 
GMARWIKELFCHNERWZiTGDWKHGFPSLTAVGATVNWGSIRIY 
FDRDLHTNSPRHS KGS YNDFSFVTHTNREGVPMALRGEHLG / OS 
FKLGSTIVXjIFEAPKDFNFQLKTGQKIRFGRALGSL 


5439 


2443 


1152 


TKPR KR RHQPAS QRQRP WS SDSTGDLLARGKGRKEENKGSDRVS 
LAP PS LRU PMMCQSEARQG PELRAAKWLHFPQLAI»RRRLGQ1»SC 
MSRPALKLRSWPLrVLYYLLPFGALRPLSRVGWRPVSRVALYKS 
V PTRLLS RAWGRUNTQ VEL PHWIjRRP VYS LY IWTFG VNMKEAAVE 
DLHHYRNLSEPFRRKLKPQARPVCGLHSVISPSDGRILNFGQVK 
NCEV^QVKGVTYSLESFIiGPRMCTEDLPFPPAASCDSFKNQLVT 
REGNELYHCVIYLAPGDYHCFHSPTDWTVSHRRHFPGSLMSVWP 
GMARW 1 KELFC31N3RWLTGDWKHGFPS LTAVGAT \NWGS I R 1 Y 
FDRDLHTNSPRHSKGSYNDFSFVTHTNRBGVPMALRGEHLG/QS 
FNLGSTIVLI FEAPKDFNFQLKTGQKIRFGEALGSb 


5440 




253 


BP I P VTPDHRIiVTMTHI V \QTFS PVNS \GQPPNYEMLKEEQEVA 
MIXSAPHNPAPPMS'TVIHIRSETSVPDHVVWSLFNTLFMNTCCI.G 

FIAFAYSVKSRDRKMVGDVTGAQAYASTAKCLNIWALILGIFMT

ItjLIIIPVLWQAQR 


5441 


2 


2054 


CRDGGKNGFM VSPMKPLEI KTQCSGPRMDPKICPADPAFFS FIN 
MSDLWANIETGEKRRLTFCHQGLSNVLDDPKSAGVATFVTQEE 
PDRFTGYWWCPTASWEGSEGLKTLRILYEEVDESBVEVIHVPSP 
ALEERKTDSYRYPRTGSKNPKIALKIaAEFQTDSQGKIVSTQEKE 
LVQPFSSLFPKVEYIARAGWTRDGKYAWAMFIiDRPQQWLQLVLL 
PPALFIPSTErreEO\RLASARAVPRNVQPYVVYEEVTNVWINVH 
DIFYPFPQSEGEDELCFIiRANECKTGFCHLYKVTAVLKSQGYDW 
SEPFSPGEGEQSLTNAlWVNEETKIiVYFQGTKDTPLEHHIiYWS 
YEAAGEIVRLTTPGFSHSCSMSQNFDMFVSHYSSVSTPPCVHVY 
KLSGPDDDPLHKQPRFWASMMEAAKIFHFHTRSDVRLYGMIYKP 
HALQPGKKHPTVLFVYGG PQVQLVNNS FKG I KYLRLNTLASLG Y 
AWVIDGRGSCQRGLRFEGALKNQMGQVEIEDQVEGLQFVAEKY 
GFIDLSRVAIHGWSYGGFLSLMGIilHKPQVFKVAIAGAPVTWM 
AYDTG YTERYWD VPENNQHGYEAGS VALHVEKLPNE PNRLLIIjH 
GFLDENVHFFHTNFLVSQLIRAGKPYQLQVAU>PVSPQIYPNER 
HS IRC PESGEHYEVTIiLHFLQEYL 


5442


1 


3474 


CGQRSRRRSPDMPEAKPAAiOCAPKGKDAPKGA^KEAPPKEAPAE " 
APKEAP P EDQ S PTAEE PTGVFL KKPDSV3VETG KDAWVAKVNG 
KELPDKPTIKWFKQKWLELGSKSGARFSFKESHNSASNVYTVEL 
HIGKVVLGDRGYYRLEVKAKDTCDSCGFNIDVEAPRQDASGQSL 
ES FKKTSBKKSDTAGELDFSGLLKKREWE EE KKXKKKDDD DLG 
3 P PE IWEIjLKGAKKSEYEKIAFQYG ITDLRGMLKRLKKAKVEVK 
KS AAFl'KiCLDPAYQVDRGNKlKLMVZISDPDLTLKWFKNGQEI K 
PSS KWFENVGKKRILTINKCTLADDAAYEVAVKDEKCFTELFV 
KEPPVLIWPLEDQQVFVGDRVEMAVEVSEEGAQVMWMKDGVEIj 

TREDSFKARYRFKKDGKRHILIFSDVVQEDRGRYQVITNGGQCE

AELIVEEKQIjEVLQDIADLIVKASEOAVFKCEVSDEKVTGKWYK 
NGVEVRPS KRITISHVGRFHKLVIDDVRPEDEGDYTPVPDGYAL 

GSLSAKLNFLBIKVEYVPKQ\EPPKIPLGFASGGKTSENAD/IV
VVAGNKLRLDV\SITGEAPSPFAT\WLKG\DEVFTTTEGRTRIE
KRVDCSSFVIESAQREDEGRYTTKVTNPIGEDVASIFLQWDVP
DPPEAVRITSVGEBWAIIIVWEPPMYDGGKPVTGYLVERKKKGSQ

RWMKLNFEVFTETTYESTKMIEGILYEMRVFAVNAIGVSQPSMN 
TKPFMPIAPTS3PLHLIVEDVTDTT7TLKWRPPNRIGAGGIDGY 

TiVT!YrriRnfiEKWVPftT^I?DVRprY2i7TVir\rr unvsnDTT coTnrmrvr 
jjvcj x v.Li&uoEi£>nvr/uilArvaKV.Uri V IvlM Jj ek\xnji. X LirKVVliViN 

IAGRSEPATLAQP VT IRE I AEPP K I RL P RHLRQTYI RKVGEQLN 
LWP FQGKPR PQWWTKGGAPLDTS RVHVRTSDFDTVFFVRQAA 
RSDSGEYELSVQIBWMKDTATIRIRVVEKAGPPINVMVKEVWGT 
NALVWQAPKDDGireEIMGYFVQKADlOCTM^ C 
TVSDItIVGNEYYFRVYTENICX5LSDSPGVSKNTARILKTGITFK 
PPEYKEHDFRMAPOTiTPLIDRVVVAGYSAAIiNCAVRGHPKPKV 
VWMKNKMEIRED PKFL I TNYQGVLTLNIRRPS PFDAG TYTCRAV 
NELGEAXAECKLEVRVpQ 


5443 


66 


1003 


SRGQLDAGQSSEQHGGNRQPEQSRSRSSSSSSSPRRSRSAAEPA~~ 
MALSMPLNGbKEEDKEPLIELFVKAGSDGESIGNCPFSQRLFMI 
LWL KGWFS VTTVDLKRKP ADLQNLAPGTHP PF ITFNS EVKTDV 
NKI EEPLBEVLCPPKYLKLS PKHPESNTAGMDI FAKPS AYIKNS 
RPEANEALERGLkKTLQKIiDE YLNS PLPDEI DENSMED I fCFSTR 
KFLIX3NEMTIADGNLLPKLH1VKVVAKKYRNFDIPKEM 
IiTNAYSRDEFTNTCPS DKEVE I \ AYSDVAKRLHQVKSRLLKE VS 
FMSSP 


5444 


2 


344 


SGP I G VTGAQMAKWIiRD YLS FGGRR PPPQP PTPDYTES DI LRA Y 
RAQKNLDFEDPY*DSESRLBPDPAGPGDSKNPGDAKYGSPKHRL 
1KVEAADMARAKALLGGPGEELEADTEYLDPFDAQPHPAPPDDG 
YME P YDAQWVMS EL PGRGVQL YDTP YEEQDP ETADG P P SGQKPR 
QSRMPQEDER PADE YDQP WEWKKDH I SRAFAVQFDS PE WERT PG 
SAKELRRPPPRSPQPABRVDPALPLEKQPWFHGPLNRADAESLL 
SLCKEGSYLVRLSETNPQDCSLSLRSSQGFLHLKFARTRENQW 
LGQHSGPFPSVPEpVIJnfSSRPLPV<^AEHLALLYPV\n , QTP*Q 
* PDWGDRRPNGQVATGLPELWGAEAPSAAAHPGLHRERHPEGLr 
RAEKPGLRGPLLGLREPLGAGPRGPWGLQEPRRCQVWFSQAPAH 
QGGGCGYGQSQGPSGRPRGGAGSRH j 


5445


2364 


486 


ILSRGFliGSVEIClQLPIiPASBPVLL^TWARRRWRETRSRREPT 
TLRAQS VCP W W I * ETRMNRS I PVEVDESEP YPSQLLKP I PEYS P 
EBESEPPAPNIRNMAPNSLSAPTMLHNSSGDFSQAHSTLKtANH 
QRPV9RQVTCLRTQVLEDSEDSFCRRHPGLGKAFPSGCSAVSEP 
ASESWGALPAEHQFSFMEKRNQWLVSQLSAASPDTGHDSDKSD 
QSLPNASADSLGGSQEMVQRPQPHRNRAGLDLPTIDTGYDSQPQ 
DVLGIRQLBRPfcPLTSVCYPQDLPRPLRSREFPQFEPQRYPACA 
QMLP PNLSPHAP WNYHYHCPGS PDHQVP YGHDYPRAAYQQVIQP 
ALPGQPLPGAS VRGLHPVQKVI LNYPSPWDQEERPAQRDCS FPG 
LPRHQDQPHHQPPNRAGAPGESLECPAELRPQVPQPPSPAAVPR 
PPSNPPARGTLKTSNLPEELRKVFITYSMDTAMEWKFVNFLLV 
NGFQTAIDI FEDR IRG ID 1 1 KWMER YLRDKTVMI I VA IE PKXKQ 
DVEGAESQLDED3HGLHTKYIHRMMQIEF1KQGSMNFRPIPVLF 
PNAKKEHVPTWliQNTHVYS WPKNKKN ILLRLLREEBYVAPPRGP 
LPTLQWPL 


5446 


972 


161 


SSWSWCTGRMRKTRLWGLLWMLFVSELRAATKLTEEKYEIjKEGQ " 
TLDVKCDYTLEKFASSQKAWQI IRDGEMPKTLACTERPS KNSHP 

VQVGRIILEDYHDHGLLRVRMVNLQVEDSGLYQCVIYQPPKBPH

MLFDRI RLWTKGFSGTPGSNENS TQNVYXI P PTTTKALCPLYT 
TPRTVTQAPPKSTADVSTPDSBINLTNVTDI1RVPVFNXVILLA 
GGFLSKSLVFSVLFAVTLRSFVP*AHEPTRMSSDFQPHPSGSCA 
KQGGRR 


5447


207 


617 


MTARTLS LMAS L VAYDDS DS EAETEHAGS FNATGQQKDTS G VAR 
PPGODFASGTLDVPKAGAQPTKHGS CEDPGG YRLPLAQLGR S DR 
GSCPSQRLQWPGKEPQVTFP IKEPS CSSLMTSHVPASHMPLAAA 
RFKQ VKLSRNF P XS S FHAQS ES ETVG KNGS S FQKKKCEDCWPY 
TPRRLRQRQALS TETGKG KD VEPQGP PAGRAPAPL YVG PGVS EF 
IQPYLNSHYKETTVPRXVLFHLRGHRGPVNTIQWCPVLSKSHML 
LSTSMDKTPKVWNAVDSGHCI/QTYS LHTEAVRAARWAP CGRRI L 
SGGFDFALHLTDLETGTQLFSGRSDFRITTIiKFHPKDHNI FLCG 
GFSSEMKAWDIRTGKVMRS YKATIQQTIiDILFLREGSE FLSSTD 
ASTRDSADRT1 1 AWDFRTSAKISNQ I FHERFTCPS LALHP RE PV 
FLAQTNGNYLALFSTVWP YRMSRRRRYEGHKVEGYS VGCECS PG 
GDLLVTGS ADGRVIiM YSFRTAS RACTLQGHTQAC VGTTYH PV LP 
S VLATCS WGGDMKI WH * A FHWLSLG EA IG DLAP ARG YS G PGR S L 
KSPSPSKSLLVLLCGRAMFQPATCPWQLPALSK 


5448 


194 


1833 


MASKVTDAIVWYQKKIGAYDQQIWEKSVEQREIKGLRNKPKKTA 
HVKPDLIDVDIiVRGSAFAKAKPESPWTSLTTKGIVRWFFPFFF 
RWWLQVTS KVIFFWLLVLYLLQVAAI VLFCSTS S PHS I PLTEVI 
GPIWLMLtiLGTVHCQIVSTRTPKPPLSTGGKRRRKLRKAAHLEV 
HREGDGS STTDNTQEGAVQNHGTSTSHS VGTVFRDL WHAAFFIiS 
GSKKAKNS IDKS TETDNGYVS LDGKKTVXSGEDG IQNHBPQCET 
IRPEETAWNTCTLRNGPSKDTQRTI TNVSDBVS SEEGPETG YSL 
RRHVDRTSEGVLRNRKSHHYKKHYPNEDAPKSGTSCSSRCSSSR 
QDSESARPESETEDVLWEDLLHCAECHSSCTSBTDVENHQINPC 

VLE1SGM IMNRVNSHI PGIG YQI FGNAVSLI LGLTPFVFRLSQA 
TDLBQLTAHSASELYVI AFGSNEDVT VLSMVI I SFWRVSLVWI 
FFFLLCVAERTYKQVGIM*TSEGVLRNRKSHHYKKHYPNEDAPK 
SGTSCS SRCSS SRQDSESARPES ETED VLWEDLLHCAECHSS CT 
SETDVENHQINPCVKKEYRDDPFHQSHLPWLHS8HPGLBKISAI 
VWEGNDCKKADMS VLEI SGM IMNRVNSHI PGIGYQIFGNAVSLI 
I/3LTP FVFRLS QATDLE QLTAHSAS EL Y VI AFGSNEDVI VLS MV 
1 15 FWRVSLVWI FFFLL CVAERTYKQVGIM 


5449 


194 


1833 


MASKVTDAlVWYQK^IGAYDQQIWEKSV^REIKJSUiiNf^PKKTA 
HVKP DL IDVDLVRGS AFAKAXPBS P WTS LTTKG I VRWFFP F PF 
RWWLQVTS KVT FFWLLVLYL I*QVAAIVLFCSTS S PHS I PLTEVT 
GPIWI^LLIXSTVHCQIVSTRTPKPPLSTGGKRRRKLRKAAHLEV 
HREGDGSSTTDNTQBGAVQNHGT3T3HS VGTVFRDIiWHAAFFtiS 
GSKKAKNS IDKSTETDNQ YVS LDGKKTVKSG ED G I QNHEPQCST 
IRPEETAWNTGTLRNGPS KDTQRTI TNVSDBVS SEEGPETG YS L 
RRHVDRTS EGVLRNRKSHH YKKHYPNBDAPKSGTS CSSRCS S SR 
QDSBSARPBSBTEDVLWEDLIjHCAECHS sctsbtdvenhqinpc * 
vkke yrddp fhqs hlp wl rsshpglb kisai vwegntockkadms 
VLEISGMIM^VNSHIPGIGYQIFGNAVSLILGLTPFVFRLSQA 
TDLEQLTAHSASELYVIAFGSNEDVIVLSMVII SFWRVSLVWI 
FFFLLCWU3RTYKQVGIM*TSEGVLRNRKSHHYKKHYPNEDA?K 
SGTSCSSRCSSSRQDSESARPESBTEDVLWEDLLHCAECHSSCT 
SETDVENHQl^PCVKKEYRDDPFHOSHLPWLHSSHPGliEKISAI 
VWEGNDCKKADMSVLEISGMIMNRVNSHIPGIGYQI FGNAVSLI 
LGLT P FVFRLS QAT DLE QLTAHSAS EL YVTAFGSNE DVT VLS MV 
I ISFWR VSLVWI FFFIiLCVAERTYKQVGIM 


5450 


B136 


1242 


GQQFAS FFG* NHPE VT VAMALTDIDLQ LQFSMSQ PEALLLLAAG 
PADHLLLQLYSGHLQVRLVLGQEELRLQTPAETLLSDSIPHTW 
LTWBG WATLSVDG FLNASSAVPGAPLEVPYGLFVGGTGTLGL P 
YLRGTSRPLRJGCLHAATLNGRSijLRPLTPDVHEGCAEEFSASDD 
VALGFSGPHSLAAFPAWGTQDEGTLEFTI*rrQSRQAPLAFQAGG 
RRGDFIYVDIFEGHLRAVVEKGQGTVLLHNSVPVADGQPHEVSV 
HINAHRLE IS VDQ YPTHTSNRGVLSYLEPRGSLLLGGLDAEASR 
HLQEHRLGLTPE ATNAS LLGCMEDLS VNG QRRGLREALLTRNMA 
AG CRLBEB EYEDDAYGHYB AFSTLAPBAWPAMEL PEPCVPEPGL 
PPVFANFTQLLTISPLWAEGGTAWLEWRHVQPTLDLMEAELRX 
SQVLFSVrRGAHYGELBLDII^QARKMFTLLDVVimroVRFIH^ 
GSEDTS DQL VLEVSVTARVPM PSCLRRGQTYLLP IQVNPVNDP? 
HIIFPHGSLMVILEHTQKPLGPEVFQAYDPDSACEGLTFQVLGT 
SSGLP VERRDQPGEPATEFS CRELEAGSLVYVHCGG PAQDLTFR 
VS DGLQ AS P PATL KWAI R PAI Q IHRS TGLRLAQG SAM P I LP AN 
LS VETNAVGQD VS VL FR VTGALQ FGELQ KHS TGG VEG AE WWATQ 
AFHQRDVE QGRVRYLSTDPQHHAYDTVENIiALEVQVGQE ILSNL 
SFPVTI QRATVVWljRLBPIiHTQNTCX2ETLTTAHLEATLEEAG PS 
PPTFH YEVVQAPRKGNLQLQGTRLSDGOjG FTQDD I QAGR VTYGA 
TARAS BAVEDTFR FRVTAPPYFS PLYTFP IHIGGDPDAPVLTKV 
LLWPBGGEG VLSADHL FVKSLNS AS YLYE VMERPRLGRLAWRG 
TQDKTTMVTS FTNEDLLRGRLVYQHDDS ETTEDD I PFVATRQGE 
SSGDMAWEEVRGVFRVAIQP VNDHAPVQT I SRI FHVARGGRRLL 
TTDDVAFS DADSGFADAQLVLTRKDLLFGS I VAVDEPTRP I YRF 
TQEDLR KRR VLFVHS GADRGW IQLO VSDGQHQATALLEVQAS EP 
YLRVANGSSLWPQ^GO^TIDTAVLHI^TNLDIRSGIIEVHYHVT 
AGPRWGQLVRAGQPATAFSQQDLLDGAVLYSHNGS LS P EDTMAF 
SVEAGPVHTDATLQVTIAl^PIAPLKLVRHKKIYVFOGEAAEI 
RRDQLEAAQEAVPPADI VFSVKS PPS AG YLVMVSRGALADEP PS 
LDPVQSFSQEAVDTGRVLYLHSRPEAWSDAFSLDVASGLGAPLE 
GVLVELEVLPAAI PLEAQNFS VPEGGSLTLAPPtiLRVSGPYFPT 
LLGLSLQVLEPPQHGPLQKEDGPQARTLSAFSWRMVEEQLIRW 
HDGSBTLTDSFVLMANASEMDRQSHPVAFTVTVLPVNDQPPILT 
TNTGLQMWBGATAP I PAEALRSTDGDSGS EDLVYTIEQPSNGRV 
VLRGAPGTEVRS FTQAQLDGGLVLFSHRGTLDGG PPFRLSDGEH 
TS PGHFFRVTAQKQVLLS LKGSQTLTVCPGS VQ P LSSQTLRASS 
SAGTD PQLLLYRWRG PQLGRLFHAQQDSTGEALVN FTQAEVYA 
GNI LYEHEMPPEPFWEAHDTLELQLS S PPARDVAATLAVAVS PE 
AACPQRPSHL WXMKGLWVPEGQRARI TVAALDASNLLASVPS PQ 
RSEHDVLFQVTQFPSRGQIiLVSEBPLHAGQPHFIjQSQriAAGQLV 
YAHGGGGTQQDG FHFRAHLQGPAGAS VAG PQTSEAFAITVRDVN 
ERPPQPQASVPLRLTRGSRAPISRAQLSWDPDSAPGEIEYEVQ 
RAPHNG FLS L VG GGIjG PVTRFTOADVDSGRLAFVANGS S VAG I F 
QLSMSDGASPPLPMSLAVD I LPSAIEVQLRAPLEVPQALGRS SL 

DQG B WFAFTNFS SSHDH FRVLALARGVNA5AVVNVTVRALLttV 
WAGGP WPQGATLRIiDPTVLDAGE LANRTGS VPRFRLLEGPRHGR 
WRVPRARTEPGGSQIiVEQFTQQDLEDGRLGLEVGRPEGRAPGP 
AGDS IiTLELWAQG VP PAVAS LDFATE P YNAAR PYS VALLS VP EA 
ARTEAGKPES S TPTGEPGPMASS PEPA VAKGGFLS FLEANMFS V 
1 1 PMCLVLLLZiAIi ILPLLF YLRKRNKTGKHDVQVLTAKPRNGLA 
GDTET FRKVE PGQAI PLTAVPGQG P P PGGQ P DPELLQFCRTPNP 
ALKNGQYWV 


5451


1 


2274 


RDSSEQGRTGDTLGRPSACMDALKPPCLWRNHERGKKDRDSCGR 
KNSEPGS PHSLEALRDAAPSQGLNFLLLPTKMLFI FNFLFSPLP 
TPALICILTFGAAIFLWLITRPQPVLPIiLDLNNQSVGIEGGAHK 
GVSQKNNDLTSCCFSDAKTMYEVFQRGliAVSDNGPCLGYRKPNQ 
PYRWLS YKQVSDRAEYLGSCLLHKGYKS SPDQFVG I FAQNRPEW 
I ISEIiACYTYSMVAVPIiYDTLGPKAl VH I VNKADIAMVI CDTPQ 
KALVLIGNVEKGFTPSLKVXILMDPFDDDLKQRGEKSGIEILSL 
YDAENLGKEHFRKPVPPS PEDZ^SVICFTSGTTGDPXGAMITHQN 
I VSHAAAPLKCVEHAYBPTPDDVAJS Yl* PLAHMFERIVQAVVYS 

AKTPLKKFUjKLAVSSKFKEIiQKGI I RHDSFWQKL I FAKI QDS L 
GGRVRVIVTGAAPMSTSVMTFFRAAMGCQVYEAYGQTECTGGCT 
FTLMDWTSGHVGVPLAC^YVKIjEDVADMNYFTVKNEGEVCI KG 
IWFKGYLKDPEKTQBAIjDSDGWIiHTGDIGRWLPNGTLKIIIDRK 
KNI FKLAQGEY1 APEKIEN1 YNRSQPVLQ I FVHGE 5 LRS SLVGV 

SGIiKTFEQVKAl FLHPEPFS IENGLLTPTLKAKRGELSKYFRTQ 
IDSLYEHIQD 


5452


1933 


1138


SRVPShCLS LSLSLS P5REP VAGAPG CGTACPPAMAT L WGG LL R 
USSLLSIiSCLAIiSVLIjLAQLSDAAKNFEDVRCKCICPPYKENSG 
HI YNKN I SQKDCDCLHWE PMPVRGPDVEAYCLRCE CKYEERSS 
VTIKVTI 1 1 YLSILGLLLLYMVYLTLVEPILKRRLFGHAQL I QS 
DDDIGDHQPFANAHDVLARSRSRANVLNKVEYAQQRWKLQVQEQ 
RKSVFDRHWLS 


5453 


111


1520 


PS IPAAVPQS APPE PHREETVTATATS QVAQQPPAAAAPGBQAV " 
AGPAPSTVPSSTS KDRP VS QP S LVGSKEE P P P ARS GSGGGSAKE 

KGLDTETTVEVAWCELQDRKLTKS ERQRFKEEAEMLKGLQHPN I 
VRFYDSWESTVKGKKCIVLVTEIiMTSGTLKTYLKRFKVMKIKVL 
RS WCRQI Ii KG WJPLfTTRTPP I IHRDtiKCDNI FI TGPTGS VKIGD 
LGLATLKRAS FAKSVIGTPEFMAPBMYEEKYDESVDVYAFGMCM 
LEMATS E Y P YSECQNAAQI YRR VTSGVKPAS FDKVAI PE VKE 1 1 
EGCTRQNKDERYSIKDIiZiNHAFFQEBTGVRVEIiAEEDDGEKIAI 
KLWLR1EDIKKLKGKYKDNEAIEFSFDLERNVPEDVAQEMVESG 
YVCEGDHKTMAKAI KDRVSL I KRKREQRQL* 


5454 


111 


1520


PSIPAAVPQSAPPBPHRJSETVTATATSQVAQQPPAAAAPGEQAV 
AGPAPSTVPSSTS KDRP VSQPSL.VGSKEEPPPARSGSGGGSAKE 
PQBERSQQQDDIEEIiETKAVGMSNDGRFLKFDIEIGRGSFKTVY 
KGLOTETTVEVAWCEliQDRKLTKSERQRFKEEAEMLKGIiQHPNI 
VRFYDSWESTVKGKKCIVLVTELMTSQTLKTYLKRFKVMKIKVL 
RSWCRQILKGLQFLHTRTPPIIHRDLKCDNIFITGPTGSVK1GD 
.LOLATLKRASFAKSVIGTPEFMAPEMYBEKYDESVDVYAFGMCM 
LEMATS E YP YS ECQNAAQ I YRRVTSGVKPAS FDKVAI P EVKEI I 
EGCIRQNXDER YS I KDLLNHAFFQEETG VR VELAEEDDGEKIAI 
KLWLRIEDI KKLKGKYKDNEAIEFSPDLERNVPEDVAQEMVESG 


5455


1359 


377 


LTMVS PATRKS LPKVKAMDFITSTAI LPLLFG CL3V FGLFRULQ 
WVRGKAYLRNAVWITGATSGLGKECAKVFYAAGAKLVLCGRNG 
GALE E LI RE LTAS HATKVQTHKP YLVTFDLTDS GA I VAAAAE I L 
• QC FG YVD I LVNNAG I S YRGTIMDTTVDVDKRVME TNYFG PVALT 
KALLPSM I JCRROGHI VAISSIQGKMS IPPRSA YAAS KHATQAFF 
DCLRAEMEQYE I E VTV I S PGYIHTNLS VNAI TADGSRYG VMDTT 
TAQGRS P VEVAQD VLAAVGKKKKD VT LADLL PS LAVYLRTLAPG 
LFFS LMASRARKERKS KNS 


5456 


2 


2332 


C6AGL VAAG AVL VLY PASRAGE RTR V?3S PAPS S LPLHS PGACG 
TEVDMDPQRSPLLEVKGNIELKRPLIXAPSQLPLSGSRLKRRPD 
QMEDGLEPEKKRTRGLGATTKITTSHPRVPSLTTVPQTQGQTTA 
OKVSKKTGPRCSTAIATGLXNQKPVPAVPVQKSGTSGVPPMAGG 
KKPSKRPAWDLKGQLCDLNAELKRCRERTQTLDQENQOLQDQLR 
IlRQQQVXAIX5TERrTLEGHIiAKV0AQAEQGQQELKNLRACr^EL 
EBRLSTQEGLVQELQKKQVELQEERRGLMSQLEEKERRLQTS EA 
ALSSSQAEVASLRQETVAQAALiLTEREERLHGLEMERRRLHNQL 
QELKGNIRVPCRVRPVLPGEPTPPPGLLLFPSGPGGPSDPPTRL 
SLSRSDERRGTLSGAPAP PTRHDFS FDRVFPPGSGQDEVFEE I A 
MLVQSALDGY PVCIFAYGQTGSGKTFTMEGGPGGDPQLEGL I PR 
ALRMLFS VAQ ELSGQGWTYS FVAS YV£ I YNETVRDLLATGTR KG 
QGGECEIRRAGPGSEELTVTNARWPVSCEKEVBALLHLARQNJR 
AVARTAQNERS SRSHSVFQLQ I SGEHS SRGLQCGAPLS LVDLAG 
SERLDPGLALGPGERERLRETQAINSSLSTLGLVIMALSNKESH 
VP YRNS KLTYLLQMSLGGSAKMLMFVN I S PLBENVS ESLNS LRF 
ASKVEPSVLTOTAQSNRKVJKTDPDLCVCVCVCVCVCVCVCVCVP 
MSMYRVRGGRVAGGCPIGWRAPCPRAIK 


5457 


2 


1540 


DDFVERRRWTRTT CLVRSPPHVPVCGHACSWNGGSLDP LKGTPA 
LLRSAERLMRKVKKLRIiDKENTGSWRS FSLNSEGAERMATTGTP 
TADRGDAAATDDPAARPQVQ KHS WDGLRS I IHGSRXYSGLI VNK 
APHDFQFVQKTDESGPHSHRLY YLGMP YGSRENSLL YS E I PKKV 
RKEALLLLS WKQMLDHFQATPHHG VYSR EEELLRER KRLGVFG I 
TSYDFHSESGLFLFQASNSLFHCRDGGKNGFMVSPGPGCVSPMK 
PLEI KTQCSGPRMDPiCICPADPAFFS FINNS DL WVANIETGE ER 
RLTFCHQGLSNVLDDPKSAGVATFVIQEEFDRFTGYWWCPTASW 
EGSEGLKTLRI LYEEVDES EVEVIHVPSPALEERKTDS YRYPRT 
GSKNPKIALKLAEFQTDSQGKIVSTQEKELVQP FS SLPPKVEYI 
ARAGWTRDGKYAWAMFLDRPQQWLQLVLLPPALFI PSTENEEQA 
ASLCQS CPQECPAVCGVRGGHQRLDQCS 


5458 


6642 


4022 


F VPGLRE PQW B PAQPS ATMSAPS EEE E YARLVMEAQ P E WLRAE V 
KRLSHELAETTREKI Q AAE YGLAVLEE KHQ LKLQ FE EL EVD YE A 
IRSEMEQLKEAFGQAHTNHKKVAADGESREESLIQESASKEQYY 
VRKVLELQTELKQLRNVLTNTQS ENERLAS VAQELKE I NQNVE i 
Q RGRL RD D I KEY KFRE ARL LQD YS EI>E EENI SLQKQ VS VLRQNQ 
VE PEGLKHE IKRLEEETEYLNSQLEDAI RLKEIS ERQLEEAL ET 
LKTEREQKNSLRKELSHYMSINDSFYTSHLHVSLDGLKFSDDAA 
EPNNDAEALVNGFEHGGLAKLPLDNKTSTPKKEGLAPPSPSLVS 
DLLS ELN I SE IQ KLKQQLMQMEREKAGLLATLQDTQKQLEHTRG 
S LSEQQEKVTRLT EN LS AL R RLQAS KERQTALDN E KDRDSHEDG 
DYYEVD I NG P E ILAC KYHVAVAEAGELRE Q LKALRS THEAREAQ 
HABEKGR YEAEGQALTEKVS LLEKASR QDR ELLARLEKELKKVS 
DVAGETQGSLSVAQDELVTFSEELANLYHHVCMCNNETPNRVML 
DYYREGQGGAGRTSPGGRTS PEARGRRS P I LL PKGLLAPEAGRA 
DGGTG DS S PS PGS SLPS PLSDP RRE PKNI YNLI A 1 1 RDQI KHLQ 
AAVDRTTELS RQRIASQELGPAVDKDKEALMEE ILKL KSLLSTK 
REQ ITTI^TVIjKANKQTAEVAIJ^IjKSKYENSKAM^ETMMKLR 
NE LKALKEDAATPSS LRAMPATRCDEY ITQLDEMQRQLAAAEDE 
KKTLNSLLRMAIQQKLALTQRLELLELDHEQTRRGRAKAAPKTK 
PATPS VSHTCACAS DRA EGTGLANQ VPCS EKH5 IYCD 


5459 


316 


1262 


RGGHRLSGMASNFNDIVKQGYVRIRSRRLGIYQRCWLVFKKASS " 
KGPKRLEKFSDERAAYFRCYHKVTEIjNNVKIWARLPKSTKKHAI 
G I YFNDDTS KTFACESDLEADEWCKVLQMBCVGTR IND I S LGE P 
DLLATGVEREQSERFNVYLMPSPNLGCYMGECALQITYEYICLW 
DVQNPR VJCL IS WPLSALRR YGRDTTWFTFEAGRMCETGEGLFIP 
QTRDGEAIYQKVHSAALAIAEQHERLLQSVKNSMLQMKMSERAA 
SLS TM VPLPRS AYWQHI TRQHS TGQLYRLQD V5 SPL KLHRTETF 
PAYRSEH 


5460 


45 


2097 


RPGCRAGEMTGSRARERVRNRVSAPCGQDSRRCDPEVLRGRSP

GLGLAEMPS CGACTCGAAAVRLI TS S LAS AQRG I SGGR I HMS VL 
GRLGTFETQILQRAPLRS FTETPAYFAS KDG I S KDGSGDGNKKS 
ASEGSSKKSGSGNSGKGGNQLRCPKCGDLCTHVETFVSSTRFVK 
CEKCHHFFVVLSEADSKKSIIKEPESAAEAVKIiAFQQKPPPPPK 
KIYNYLDKYWGOSFAKXVLSVAVYNHYKRIYNNIPANLRQQAE 
VBKQTSLTPRELEIRRREDEYRFTKHjQIAGISPHGNALGASMQ 
WW vi* WW x FWa KJiuaiab VltDS SUDD I KLaKoNI LLLGPTGSGKTTr Ti 
AQTLAXCLDVPFAICDCTTLiTQAGYVGEDI BSVIAKLLQDAHYN 
VEKAQQGIVFLDEVDKIGSVPG1HQLRDVGGEGVQQGLLKLLEG 
TIVWPEKNSRKI^GBTVQVDTraiLFVASGAFNGLDRIISRRK 
NEKYLGFGTPSNLGKGRRAAAAADLANRSGESNTHQDIEEKDRL 
LRHVEARDLIEFGMIPEFVGRLPVWPLHSLDEKTLVQIIjTBPR 
NAV I PQ Y QAL FS M DKCE IjNVTEDAIiKAIARIiALERKTGARGLiRS 
IMBKLLLBPMFEV-PNSDIVCVEVDKEVVEGKKEPGYIRAPTKES 
SE E E YD S GVE E EGW PRQ ADAANS 


5461


1481 


160 


' INPPPPPKSPCGRARKWRRi^RPGAPEAAVMELP^GPGPERLFD 
bHKX»f t» u lii*l» VIjUjYAP vgfcllvlrlflg ihvflvs calpd 
S VLRRF WRTM CAVLGLVARQEDSGLRDHS VRVL ISNHVTPFDH 
NXVNLLTTCS T PLLNSP PS FVCWSRGFMEMNGRGEbVESLKRFC 
ASTRLPPTPLLLFPEEEATNGREGLLRFSSWPFS3;QDWQPLTL 
QVQRPLVSVTVSDASWVSELLWSLFVPFTVYQVRWLRPVHROIjG 
EANBEFALRVMLVAKELGQTGTRLTPADKAEHMKRQRHPRLRP 
QSAQSS FPPSPqPS PDVQLATLAQRVKEVLPHVPLGVTQRDLiAK 
TGCVDLTITNLLEGAVAFMPEDITKGTQSLPTASASKFPSSGPV 
TPQPTALTFAKSSWARQESLQERKQALYBYARRRFTERRAQEAD 


5462




33*3 


KIKERQMSANKS P PSAQKSVLPTAI PAVLPAAS P CSSPKTG LS A 
RLSNGS FS APSLTNSRGS VHTVSFLLQIGLTRE S VT I EAQELS L 
SAVKDLVCSIVYQKFPECGFFG^nfDKII^RHDMNSENILQLIT 
SADEIHEGDLVEVVLSAIiATVEPFQIR^HTLYVHSYKAPTFCDY 
CGEMLWGLVRO^LKCBGCGLNYHKRCAFJCXPNNCSGVRKRRLSW 
VSLPGPGLSVPRPLQPEYVALPSEESHVHQBPSKRIPSWSGRPI 
WMEKMVMCR VKVPHTFAVHS YTRPT I CQ YCKRLLKGLFRQGMQ C 
KDCKFN CHKRCASKVPRDCLGEVTFNGE PSSLGTDTD IPMD IDN 
NDINSDSSRGLDDTEEPSPPEDKMFFLDPSDIiDVBRDBBAVKTI 
SPSTSNNIPI^RWQSIXHTKRKSSTMVKEGWmiYTSRDNIiRK 
RH YWRLDS KCLTLFQNESGS KYYKE I PLSEILRISS PRDFTNI S 

QGSNPHCFEIITDTMVYFVGENNGDSSHNrPVLAATGVGLDVAQS 

WRIOVTROA1M PUT Dn a GUPT c TsnpvnxmuvTiT.eT o t owe?xinr\ T nn 
BisftftiRVftimrv x. i'yrtc* vuiC) r'O'^O rSJJrllUJiaia 1 £j lavbNCQlQE 

NVDISTVYQIFADE VLGSGQ FGI VYGGKHRfCTGRD VA IKVIDICM 
RFPTKQESQLRNEVAILQNLHHPGIVNLECMFETPERVFWMEK 
LHGDWLEMILSSEKSRIiPERITKFMVTQILVALRNLHFKNIVHC 
DLKPENVLIiASAEPFPQVKLCDFGFARl IGEKSFRRSWGTPAY 
LAPEVLRSKGYNRSLDMWSVGVItYVSLSGTFPFNEDEDINDQI 
ONAAFM YP PNPWRE I SGEAI DLIWNLLQ VKMRKR YS VDKSLSH P 
WLQDYQTWLDLREFETRIGERYITHESDDARWEIHAYTHNLVYP 
KHPIMAPNPDDMEBDP 


5463


237 


1012 


LLSVTMTTSRCSHLPEVLPDCTSSAAPVVKTVEDCGSLVNGQPQ 
YVMQVSAKDGQLLSTWRTIiATQS P FNDR PM CRI CHEGS S QEDL 
lspcectgtlgtihrsci^hwlsssntsycblchprfaverkprH 
plvewlrnpgpqhe jcrtlfgdmvcfl f 2 tplat isg w l clrg av 

DHLHFSSRLEAVGLIALTVALFTiyiFWTIiVSFRYHCRLYNEVm 
RTNQRVILLIPKSVNVPSNQPSLLGLHSVKRNSKETW




195 


677 


SPSMNPRKKVDLKLirVGAlGVGKTSLlliHQYVHKTFyEEYQTTIj 
GASILSKI I ItiGDTTLKIiQl WDTGGQERVRSMVSTFYRGSDGCI 
LAFDVTDLESPEALDIWRGDVLAKIVPMEQSYPMVLLGNKIDLA 
DRKYQSILENHLTES IKLSPDQSRSRCC


5465 


5278 


3348 


KGDPREFIR^REALECDYVSAHLHEWIDLIFGYKQQGPAAVBA 

VNVFHHLPYEGQVDIYNINDPLKETATIGFINNFGQIPKQLFKK 

PHPPK^VRSRLNGDNAGISVLPGSTSDKIFFHHLDNLRPSLTPV 

XELXEPVGQIVCTDKGILAVEQNXVZiI PPTWNKTFAWGYADLSC 

RLGTYESDKAMTVYECLSEWGQILCAICPNPKLVITGGTSTWC 

WEMGTSKEKAKTVTLKQALLGHTDTVTCATASLAYHIIVSGSR 

DRTCIlWDLNKLSFLTOIjRGHILJVP\/t;2VT.r , TlTOT.TWriTVen&r'n»v 1 

IH VWS INGNP I VS VNTFTGRSQQI I CCCMSEMNEWDTQNVI VTG 

HSDGWRFWRMEFLQVPETPAPEPAEVLEMQEDCPEAQIGQEAQ 

DEDSSDSEADEOSISQDPKDTPSQPSSTSHRPRAASCRATAAKC 

TDSGSDDSRRWSDQLSLDBKDGFIFVNYSEGQTRAHLQGPLSHP 

HPNPIEVRNY3RLKPGYRWERQLVFRSKLTMHTAFDRKDNAHPA 

EVTALGI S KDHS RI L VGDS RGR VFS WSVSDQPGRSAADHW VKDE 

GGDSCSGCS VRFSLTERRHHCRNCGQLFCQKCSRFQSE I KRLKI 

S 3 P VR VCONCY YNLOHERGSE DGPRMP I 


5466


3


992 


HACWttSAHASGP^VRWWJUCPJlSVMGIQTSPV^^ 
I^LAVGSYIiVRRSRRPQVTLLDPNEKYLLRLLDXTTVSHNTKRF 
RFALPTAHKTLGLPVGKHIYLSTRIDGSLVIRPYTPVTSDEDQG 
YVD LVI KVYLKG VHP KFP EGGKMSQ YLDSIiKVGD WE FRGPSGL 
LTYTG KGH FN IQPNKKS P P EPRVAKKLGM1AGGTGITPMLQLIR 
AILKVPED PTQC FLLFANQTBKD 1 1 LREDLEELQARYPNRFKLW 
FTLDHP PKD WA YSKG FVTADMI R EHLP APGDDVLVLLCG P PPMV 
QLACHPNLDKLGYSQKMRFTY


5467 


2103


4 


GEALRVGTRGCRRDLPD PQARI F IQKKDLEEJDES VTAAHLKSRG 
RS PR K I DQFCNS SNMVHGS VTFRDVAI DFS QEBWECLQPDQRTL 
YRDVMLENYSHLISLAGSSXSXPDVITLLEQBKEPWWVRKETS 
RRYPDLELKYGPEKVSPENDTSEVNLPKQVIKQISTTLGIEAFY 
FRNDSEYRQFEGLQGYQEGNINQKM I S YEKLPTHTPHASLICNT 
HKPYECKECGKYFSCGSNLIQHQS IHTGFJCPYKCKECGKAFQZiH 
IQLTRHQKFHTGEKTFECKECGKAFNLPTQLNPJIXNIHTVKXLF 
ECKECGKSFNR5SNLTOHOv^THAnVTrPYry^^rav , aT?\TOr2CT<rr t 1 

QHQKIHSNEKPFVCKBCGMAFRYHYQLIEHCQIHTGEKPFECKE 
CGKAFTLLTKLVRHQKIHTGEKPFECRECGKAFSLIiNQLNRHKN 
IHTGEKP FECKECGKSFNRSSNLVQHQSIHAGI KPYECKECGKG 
FNRGAHLIQHQKIHSNEKPFVCRECEMAFRYHCQIjIEHSRIHTG 
DKPF BCQDCGKAFNRGSS LVQHQS IHTGEKPYECKE GGKAFRLY 
LQLSQHQKTHTGEKPFECKECGKFFRRGSNUJQHRS IHTGKKP ? 
ECKECGKAFRLH^LIPJiQKLHTGBKPFBCKECaKAFRLIlMQl,I 
RHQKLHTGEKPFECKECGKVFSLPTQLNRHKNIHTGEKA3 j 


5468


225 


2976 


S FLTDL FQ S LAQLENLC KQLY ETTDTT TRIjQ AE XAL VE FTNS PD 
CLSKCQLLLERGSSSYSQLLAATCLTKLVSRTONPLPLEQR1DI f 
RNYVLNYLATR PKLATFVTQALIQLYAR I TKLGWFDCQKDDYVF 
RNA J.TDVTRFLQDSVE YCI IGVTILS QLTNEINQVSATAPL I EA 
DTTHPLTKHRKIASSFRDSSLFDIFTLSCNLLKQASGKNLNLND 
ESQHGLLMQLLKLTHNCLNFDFIGT5 TDESSDDLCTVQ I PTS WR 
SAFLDSSTLQLSTIGRCEYEKTCALLVQIiFDQSAQSYQELIiQSA 
SASPMDIAVQEGRLTWLVYIIGAVIGGRVSFASTDEQDAMDGB1, 
VCRVLQLMNLTDSRIAQAGNBKLELAMLSFFEQFRKIYIGDQVQ 
KSSKLYRRLSEVLGLNDETMVLSVFIGKI ITtfLKYWGRCEPITS 
KTLQLLNDLS IG YS SVRKLVKLSAVQFMLNNHTSEHFS FtiGINN 
QS^TDMRCRTTFYTALGRIJ.MVDLGEDEDQYEQFMLPIi'FAAFE 
AVAQMFS7NSFNEQEAKRTLVGLVRDLRG IAFAFNAKTSPMMLF 
EWIYPSYMPILQRAIELWYHDPACTTPVLKU^AELVHNRSQRLQ | 








FDVSSPNGiLL^TSKMiTMYGNRltTIX3EVPKDQ\nfALianCG| 
IS I CPSMLKAALSGSYVNFGVFRLYGDDALDNAIjQTFI KLLXjS I 
PHSDLLD YP KLSQS YYSLLEVLTQDHMNFIASLEPHVI M Y I LS S 
ISEGI/TAIiDTM VCTGCCS CLDHI VTYLFKQLSR5 TKKRTTPLNQ 
ESDRFI^IMQQHPEMIQQMLSTVLNIIIFEDCRNQWSMSRPLLG 
LILLNEKYFSDLRNS TVNSQPPEKQQAMHLCFENLMEG I BRNLL 
TKNKDRFTQNLSAFRREVNDSMKNSTYGVNSNDMMS i 


5469 


134 


2653 


DQE FETS LVP WHI#PMG WL CSGLLFP VSCLVLLQVASSGNMKVLQ 
BPTCVSDYMS ISTCEWKMNG PTNCSTELRLLYQLVFLLSEAHTC 
VPENNGGAGCVCHLLMDDVVSAjDNYTLDLWAGQQLLWKGSFKPS 
EHVKPRAPGNLTVHTNVSDTLLLT17S.VPYPPDNYLYNHI»TYAVN 
IWSENDPADFRIYWTYLEPSIiRlAASTLKSGISYRARVRAWAQ 
CYNTTWS E WS PSTKWHNS YREPFEQHLLLGVS VS CIVI LAVCLL 
CYVS I TKIKKEWWDQI PNPARS RLVAII IQDAQGSQWE KRS RGQ 
EPAKCPHWKNCLTKLLPCFLEHNMKRDEDPHXAAKEMPFQGSGK 
S AW C P VE I S KTVLWPES I SWRCVELFEAP VECE3 EEEVEEE KG 
SFCASPESSRDDFQEGREGIVARLTESIjFLDLLGEENGGFCQQD 
MGES CLLP PSGS TS AHMPWDEF PS AG P KEAPP WGKEQ PLHL E PS 
PPASPTQSPDNLTCTETPLVIAGNPAYRSPSNSLSQSPCPRELG 
PDPIaIiAR1JLEEVEPEMPCVPQLSEPTTVPQPEPETWEQII,RR>JV 
IjQHGAAAAP VS A PTS G YQE FVHA VE QGGTQASA WGLG P PG EAG 

YKAFSSUASSAVSPEKCGFGASSGBEGYKPFQDLIPGCPGDPA 

pvpvplftfgldrepprspqsshlpssspehlglepgekvedmp 
kpplpqeqatdplvdsi^sgivysaltchlcgiilkqchgqedgg 
cyrpvmaspccgcccgdrasppttplrapdpspggvpleaslcpa 
slapsgisekskssssfhpapgnaqsssqtpkivnfvsvgptym 

RVS


5470


17 


1418 


TACRIRTSIiNRGIAAVKKDAVEMLASYGIoAYSLMKFPltapMSljF 1 

knvglvfvns krdrtkavlcmwagaiaavfhtl i aysdlg y yi 
inklhhvdesvgsktrraflylaafpfmdamawthagillkhky 
sflvgcasisdviaqwfvaillhshlecrepllipilslymga 
lvrcttlclgyyknihdi i pdrsgp elggdati rkmlsfwwpiia 
lilatqrisrp i vljlifvs rdlgg s saateavai ltatyfvghm p 

YGWLTE I RAVYP AFDKNNPSNKJjVSTSNTVTAAH I KKFTFVCMA 
LSLTLCFVMFVfTPNVSEXI L IDI IGVDFAFAELCWPLR I FS FF 
P VP VTVRAHLTGWLMTLKKT FVLAP S S VLRI IVLIAS LWLPYL 
GVHGATLGVGS LLAG FVGE STMDAI AACYVYRKQ KXKMENE SAT 
EGEDSAMTDMPPTEEVTDIVBMREENE 


5471
5472


1B?8 


£58 


RSSAPPGPQRAAAATAAAAAAGVEMAAAAAQGGGGGEPRRTEGV | 
GPG VPGE VEMVKGQ PFDVG PRYTQLQY I GEGAYGMVSSAYDHVR 
KTRVAIKKISPFEHQTYCQRTLREIQILLRFRHENVIGIRDILR 
AS TLEAMRDVY I VQDLMETDL YKLLKS QQLSNDH I CYFIiYQ I LR 
GLK Y I HSANVXHR DLKPSNLL I NTTCDLKI CDFGLARI ADPBHD 
HTGFLTE YVATRWYRAPEI MLNSKGYTKS I DIWS VGCILAEMIjS 
NRPIFPGKHYI*DQLNHII/3ILGSPSQBDLNCIINMKAR1IYLQSL 
ffa K. I avawaiujFPKSuS KALDIiLDRMLT FNPNKR I TVEEALAHP I 
YLEQY YDPTDEPVAEEPFTFAMELDDLPKERIjKELI fqetarfq 
PGVLBAP 1 




1469 


753 


LYVMARYLSDJSK VAVSIDRLCKANGRSPS IPFGTVRI PGRARVRj 

dpqalwifgygslvwrpdfaysdsrvgfvrgysrrfwqgdtfhr 
gsdkmpgrvvtlledhegctwgvayqvqgeqvskalkylnvrea 

VbGGYDTKEVTFYPQDAPDQPLKALAYVATPQNPGYLGPAPEEA 
IATQILACRGFSGHNLEYLIiRVRDVMQLGGPQAQDEHIiAAIVDA 
VGTMLPCFCPTEQALALV 


5473


3 


2119 


FMNVKLLIQDIiEDIEQRVPVMDAQYKIITKTAHI»ITKESPQEBG j 

KEMFATMSKLKEQLTKVKECYSPLLYESQQLLIPLBELEXQMTS 

FYDSI/3KINEIITVLEREAQSSALFKQKHQELLACQENCKKTLT 

LIE KGSQS VQKFVTLSNVLKHFDQTRLQRQ IAD IHVAFQSMVKK 

rGD^KHVBTNSRLMKKFEESRAELEKVLRIAQEGIiEEKGDPEE 

LuRRHTEFFSQLDQRVLNAFLKACDELTDILPBQEQQGLQEAVR 

^HKQWKDLQGEAPYHLLHLKIDVEKNRFLASAEECRTELDRET 








KLMPQEGSEK* IKEHRVFFSDKGPHHLCBKRLQLIEBLCVKLPV - 

nnovprvpor'Tout j*t*t vpt — — — — — — — — 

nifevxujri ^l^H\rrhKELRAAjDSTY2.KLMEDPDKV3KDYTSRFS 

EFS S W I STNETQLKGI KG BAIDTANHGEVKRAVE BIRNGVTKRG 
ETLS WL KSRLKVLTEVS S ENEAQXQGDELAKLS S S FXALVTLLS 
EVE KMLSNFGDCVQ YKEI VKNSLEELI SGSKBVQEQAE KZ LDTE 
NL FEAQQLLLHHQQKTKR I SAKKRDVQQQ IAQAQQGEGGLPDRG 
H E E LR KLE S TLDGLERS RE RQE RR I Q VTLRKWERFETNKETVVR 
x i* i 5> bHERFI*S FSSIjESLSSELEQTKEFS KRTES I AVQAEN 
LVKEAS EIPLGPQNKQLLQQQAKS I KEQVKKLEDTLEBE YVI DK 
S 


5474 


2 


780 


TPDVRQLQASRRGIAVASWCSPRWFAGEEMAFVKS3WLLRQSTI " 
LKRWKKNW FDLWSDGHL I YYDDQTRQN I EDKVHM PMDC IN I R1X3 

QECRDTQPPDGKSKDCMLQIVCSIDGKTISLCAESTDDCLAWKFT 
LQDSRTNTAY VG S AVMTDETS WSS PP P YTAYAAPAPE VGRTLS 
LQQAYGYGPYGGAYPPGTQWYAANGQAYAVPYQYPYAGLYGQQ 
PANQVI IRERYRDNDSDIiALGMIiAGAATGMALGSLFWVF 


5475


2 


506 


ARGWLESLSLTCQTTPPPSSPail^SPETFIHTMPPNLTGYYRF 
VSQKNMEDYLQALNlSlAVRKIALLIiKPDKE I EHQGNHMTVRTL 
STFRNYTVQFDVGVEFEEDLRSVDGRKCQTIVTWEEEHLVCVQK 
GEVPNRGWRHWLEGEMLYLKLTARDAVCEQVFRKVR 


5476 


192 


1457 


5DSMSLLDCFCTSRTQVESLRPEKQSETSIHQYLVDEPTLSWSR 
PSTRAS EVLCSTNVSHYBLQVEIGRGFDNLTSVHLARHTPTGTL 
VTIKlTNLENCNEERliKALQKAVILSHFFRHPNlTTYWTVFTVG 
S WLWVI S PFMAYGSASQLLRTYFPEGMS ETLIRNI ijFGAVRGLN 
YLHQNGCIHRS I KASHILI SGDGLVTLS GLSHLHS L VXHGQRHR 
AVYDFPQFSTSVQPWLSPELLRQDliHGYNVKSDIYSVGITACEL 
ASGQVPFQDMHRTQMLLQKLKGPPYSPLDISIFPQSESRMKNSQ 
SGVDSG IGESVLVS SGTHTVNSDRLHTPSSKTFS PAFFSLVQLC 
LQQDPEKRPSASSHiSHVFFKQMKEESQDSILSLLPPAYNKPSI 
SliPPVLPWTEPECDFPDEKDSYWBF 


5477 




1044 



RGNSRLRYSHEDELQLPRLPELFBTGRQLLDEVEVATEPAGSRI 
VQEKVFKGLDLLBKAAEMLSQLDLFSRNBDIjEEIASTDLKYIjLV 
PAFQGALTMKQVNPS KRLDHLQRAREHF INYLTQCHCYHVAEFE ? 
LPKTMNNS AENHTANSSMAYPS LVAMAS QRQAKI QR YKQKKELB 
KRLS AMK S AVESGQADD ER VRE YYLLHLQR W IDISLEEIESI DQ 
EIJGCLRERDSSREASTSNSSRQERPPVKPFILTRNMAQAKVFGA 
GYPSLPTMTVSDWYEQHRKYGAIiPDQGIAKAAPEEFRKAAQQQE 
BQEEKEEEDDEQTLHRAREWDDWKDTHPRGYGNRQNMG 


5478 


2 


835 


KTVRIWVPNVKGESTVFRAHTATVRSVHFCSDGQSFVTASDDKT
VXVWATHRQKFLFSLSQHXNWVRCAKFSPDGRLIVSASDDKTVK
LWDKSSRECVHSYCEHGGFVTYVDFHPSGTCIAAAGMDNTVKVW
DVRTHRLLQHYQLHSAAVNGLSFHPSGNYLITASSDSTLKILDL
MEGRLLYTLHGHQGPATTVAFSRTGEYFASGGSDEQVMVWKSNF
vxKjutujCt v i AV^Hi^^AiijrftoSMGNLTVSILEQRLTLEEDKIiKQC 
LENQQLIMQRATP 


5479 


2 


835 


KTVRIWVPNVKGESTVFRAHTATVRSVHFCSDGQSFVTASDDKT
VKVWATHRQKFLFSLSQHINWVRCAKFSPDGRLIVSASDDKTVK
LWDKSSRECVHSYCEHGGFVTYVDFHPSGTCIAAAGMDNTVKVW
DVRTHRLLQHYQLHSAAVNGLSFHPSGNYLITASSDSTLKILDL
MEGRLLYTLHGHQGPATTVAFSRTGEYFASGGSDEQVMVWKSNF
DIGDHGEVTKVPRPPATLASSMGNLTVSILEQRLTLEEDKLKQC
LENQQLIMQRATP


5480 


444 


1952 

1 


LSLTSRMEEAELVKGRLQAITDKRKIQEEISQKRLKIEEDKLKH ' 
QH LKKKALR EKWLLDG IS SGKBQEEMKKQNQQDQHQ I QVLEQS I 
LRLEKEIQDLEKAELQISTKEEAILKKLKSIERTTEDIIRSVKV 
EREERAEES IEDIYANIPDLPKSYIPSRLRKEINEEKEDDEQNR 
KALYAMEIKVEKDLKTGESTVLSS I PLPSDDPKGTGI KVYDDGQ 
KS VYAVS SNHSAAYNGTDGLAP VEVEELLRQAS ERNS KS PTE YH 
EPVYANPFYR PTTP QRETVTPGPNFQERI KI KTNGLGIGVNES I 
4NMGNGLSEERGNN FNHI S PI PPVPHPRS VI QQAEEKLHTPQKR 








lmtpweesnvmqdkdapspkprlsprbtifgksehqnssptcqb" 
deedvrywivhslppdindtepvtmifmgyqqaedseedkkflt 
gydgi ihaelwiddeeeedegeaekpsyhpiaphsqvyqpakp 
tflprkrsea9phekhk3 


5481 


3 


1422 


NSPGSVCLCQCVCPSLLHCLPPLLLtiLIiLPIiIiIjHESPQPPAIiRV 
VATSS DRNFMNKHQKP VLTGQRFKTRKRDE KE KFE PTVFRDTL V 

OGLNEAGDDIjEAVJlK'FT.nQ'TYlCDT FIVOTD VX TVPr TE»tvtt.'I77v/>oil*t a 
V» *m>* v^vjvr iMJo 1 VJortijU I KK lAJJl LaT 1J1 lAVABSMIi/^ 

PGGTRIDDGDKTK^r^NHCVFSANEDHETIRNYAQVFNKLIRRYK 
YLEKAFEDEMKKLLLFLKAFSETEQTKLAMLSGIIjLGNGTLPAT 
I LTS L FTDS LVKEG IAAS FAVKL FKAWMAEKDANS VTSS LRKAN 
LDKRliliELFPVNRQSVDHFAKYFTDAGLKELSDFLRVQQSLGTR 
XELQKELQERLSQECP I KEVVLYVKEEMKRNDLPETAVI GLLWT 
CIMNAVEWNKKEELVAEQWLKHLKQ YAPLLAVFSS QG QS EL I LI> 
QKVQEYCYDNIHFMKAF^KIVVLFYKADVLSEEAILKWYKEAKV 

2lTff31^QVT7T.nrtMWl7\/17CJT AMI Dt>T>G DCDnci DIT 
aauivo v r Jjjj^ri aac v is w .UUAliUs Jb£»o &oi£Ua fc»N 


5482 


1492 


528 


THWMTGMC YAPHQ VLS Y I NGVTTS KPGVS LVYSM PS RNLSLRL " 
EGLQEKDS G P YS CS VHVQDKQGKS RGHS I KTLELNVLVPPAP PS 
CRLQGVPHVGANVTLS CQSPRS KPAVQYQWDRQLP SFQTFFAPA 
LDVIRGSLSLTNLSSSr^GVWCiaumEVGTAQCNVTLEVSTGP 
GAAW AG AWGTLVG LG L LAGL VL L YHRRG KALE E PAND I KEDA 
T APRTTiPUPTf QQTYPTQ VMf , i'T T T.CC\r T T 1 CROnT TJnnupnnnnrnT mo 

XAriViwr VHT *waUi aonntjii40S V 1 JMiUiUK r r flLiy VH PK AT if K 

TPSLSSQALPSPRLPTTDGAHPQPISPIPGGVSSSGLSRMGAVP 
VMVPAQSQAGSLV 


5483


j 1 


788 


FPFFKGCRAGRGNESDYRKLEEMHQRFLVSERSKDDLQLRLTRA 
ENRI KQLETDS S EE I S R YQEW I QKLQNVLES E REN CGLVS EQRL 
KI^QENKQLRKETESLRKIALEAQKKAKVKISTMEHEFSIKERG 
FEVQLREMBDSNRNSIVELRHLLATQQKAANRWKEETKKLTESA 
F I RINNLKS ELSRQKLHTQELLSQLEMANEKVAENEKLILEHQE 
KANRLQRRLSQAEERAASASQQLSVITVQRRKAASLMNLENI 


5484 



3 


1997 



IMADMEDLFGSDADSEAERiCDSDSGSDSDSDQEKAASGSNASGS 
ESDQDERGDSGQPSNKELFGDDSEDEGASHHSGSDNHSERSDNR 
SBASERSDHEDNDPSDVDQHSGSEAPNDDEDEGHRSDGGSHHSE 
AEGSEKAHSDDEKWGREDKSDQSDDEKION'SDDBERAQGSDEDK 
LQNSDDDEKMQNTDDBERPQLSDDERQQLS BEE KANSDDERP VA 
SDNDDEKQNSDDEEQPQLSDEEKMQNSDDERPQASDEEHRHSDP 
EEBODHKSESARGSDSEDEVLRMjaRKKAIASDSEADSDTEVPKD 

M QfSTMT^T .TWV^ A riT\ T o o fVUnVBTynnnrtni mmviT T»\wf*#>wwn 
wovy i nuijruwu;uXobljolA>sUArlri irUyJr VDbNGL»PQDQQEEE 

PIPETRIEVE I PKVNTDLGNDLYFVKLPNFLSVEPRPFDPQYYE 

DEFEDEEMLDEEGRTRLKLXVENTIRWRIRRDEEGNEIKESNAR 

IVKWSDQSMSI»BliGireVFDVYKAPLQGDHNHLFlRQGTGIiGX3QA 

VFKTKLTFRPHSTDSATHRKMTLSIiArRCSKTOKlRILPMAGRD 

PECQRTEMIKKEEERLRAS I RRES QQRRM RE KQHQRGLSAS YLE 

PDRYDEEEEGEESISLAAIKNRYKGGIREERARIYSSDSDEGSE 
EDKAORIiLKAKKLTSDEVRPJJIjPWfiRfiT.QrTnRP'rivT.MPWT t*r%ri 
AGTN 


5485 


161 


1074 


KRKI LSSMMDSEAHEKRP PILTSSKQDISPHITNVGEMKH YLCG 
CCAAFNNVAITFPIQKVLFRQQLYGI fCTRDAlLQLRRDGFRNL Y 
RG ILP P LMQKTITLAIjMFGLYEDLS CLLHKHVS AP E FATS G VAA 
VLAGTTEAI FTPLERVQTLLQDHKHHDKFTNTYQAFKALKCHGI 
GEYYRGLVP 1 LFRNGIiSNVLFFGLRGPIKEHLPTATTHSAHLVN 
DFICGGLLGAMLGFLFFP3NVVKTRIQSQIGGEPQS FPKVFQKI 
WTiERDRKLINLFRGAHLNYHRSLISWGIINATYEFLLKVI 


5486 


1404 


142 


I PGST I S WSPAAARGLSVCRCCRLHPASAMDLFGDliPEP ERS PR 
PAAGKEAQKGPLLFDDLPPASSTDSGSGGPLLFDDLPPASSGDS 
GS LATS I SQMVKTEGKGAKRKTSEEEKNGS EEL VEKKVCKAS S V 
IFGLKGYVAERKGEREEMQDAHVILNDITEECKPPSSLITRVSY 
FAVFDGHGG I RAS KFAAQNLHQNLIRKFP KGDVI S VEKTVKRCL 
LDTFKHTDEEFLKQASSQKPAW2CDGSTATCVLAVDNILYTANLG 
DSRAILCRYNEBSQKHAALSLSKEHNPTQYEERMRIQKAGGNVR 
DGRVLGVLEVSRSIGDGQYKRCGVTSVPDIRRCQLTPNDRFILL 








acdglfkvftpeeavnfilscledekiqtregj<saadaryeaac 

NRLANKAVQRGSADNVTVMWRIGH 


5487 


53 S 


182


AVSLEQ IRG LQT PAP VPLPLQP C PSNCDMERVTLALLLLAGLTA 
LSANDPFANKDD PPYYDWKNLQLS GL I CGG LLAIAG I AAVLSG K 
CKCKS3QKQHSPVPEKAIPLITPGSATTC 


5488 


1072 


259 


AMAASGB PQRQWQE E VAAVWVGS CMTDLVS LTS RXi P KTGETIH 
GHKFFIGFGG KGANQC VQAARLGAMTSMVCKVGKDS FGNDYIEN 
LKQNDISTE FTYQTKDAATGTAS 1 I VNNEGQNI I VTVAGANLLL 
NTED LRAAANVI S RAKVMVCQLE I TPATSLE ALTMARRSGVKTL 
FNPAPAIADLDPQFYTLSDVFCCNESEAEILTGLTVGSAADAGE 
AALVLLKRGC^VVlITLGASGCWluSQTEPEPKHIPTEKVKAVD 
TTVSFKI 


5489 


81 


893 


GKGPVAAFIDQSNIFLTDPKlFLGQWREEPKMPLLIiiX»ET^PljK 
LERD CRS PVE P WAAAS PDLALACL CHCQDLS SG AFPNRG VLGG V 
LFPTVEMVIKVFVATSSGS IAIRKKQQE WGFLEAMKIDFKELD 
IAQDEDNRRWMRENVPGEKKPQNGI PLPPQI FNEEQYCGDFDSF 
FS AKEENI I YS FLGLAP PPDS KG SEKAEEGGETEAQKEGS EDVG 
NLP EAQEKNE E EGETATEET3 E I AMEGAEGEAEEEEETAEGESP 
GEDEDS 


5490 


81 


893 


GKGPVAAFIDQSNI FLTDPKI FLGQWREEPKMPLLLLGBTEPLK" 
LERDCRS PVEPWAAASPDIiAIACLCHCQDLSSGAFPNRGVLGGV 
LFPTVEMVIKVFVATSSGS IAIRKKQQE WGFLEANKIDFKELD 
I AGDEDNRRWMRENVPGBKKPQNGI PLPPQI PNEEQYCGDFDS F 
FS AKEENI I YSFLG LAP PPDSKGSEKAE EGG ETEAQKEGSEDVG 
NLPB AQE KNEBEGETATEE TE E IAMEGAEGEAE EEEETAE GEE P 
GEDEDS 


5491 


204 


1194 


GSAPRLS LG PTGAQARDPD WWARPPSRP YTQ SKEDRPDTEGRS E 
QGDMAS S FLPAGAITG DSGGE LSSGDDSGEVEFPHS PEIEETS C 
LAELFE KAAAHLQG LI QVASREQLLYLYAR YKQVKVGNCNT P KP 
SFFDFEGKQKWBAWKALGDSSPSQAMQEYIAVVKKLDPGWNPQI 
PEKKGKEANTGFGGPVISSLYHEETIREEDKNIFDYCRENNIDH 
ITKAIKSKNVDVNVKDEEGRAI^iHWACDRGHKELVTVLLQHRAD 
t INCQDNEGQTALHYASACE FLD IVELLLQSGADPTLRDQDGCLP 
EEVTGCKTVSLVLQRHTTGKA 


5492 


3


1896 


ASKNPLSAVCTTGIMSSLAVRDPAMDRSLRSVFVGNIPYKATtili 
QLKDIFSEVGSWSFRLVYDRETGKPKGYGFCEYQDQETALSAM 
RNLNG RE FS GRALRVDNAAS EKNKEE LKS LG P AAP I IDSPYGDP 
IDPEDAPES ITRAVASLPPEQMFELMKQMKLCVQNSHQEARNML 
IX3NPQIAYALLQAQVVMRrMDPEIALKILKRKIHVTPLIPGKSQ 
SVSVSGPGPGPGPGLCPGPNVLLNOX3NPPAPQPQELARRPVKD I 
P PLMQTP IQGGI PAPGP I PAAVPGAGPG S LTPGGAMQP QLGMPG 
VGPVPLERGQVQMSDPRAPIPRGPVTPGGLPPRGLLGDAPNDPR 
GGTLIiSVTGEVEPllGYIiGPPHQGPPMHKASGHDTRGPSSHEMRG 
GPLGDPRLLIGEPRGPMIDQRGLPMDGRGGRDSRAMETRAMETE 
VLE TOVMERRG^TCAMETRGMEARGMDARG L EMRG P VP S S RG P 
MTGG IQG PGP INIGAGG P PQG PRQVPGI SGVGNPGAGMQGTG I Q 
GTOMQGAGIQGGGMO^GIQGVSIQGGGIQGGGIQGASKQGGSQ 
PSSFSPGOSQVTPQDQEKAALIMQVLQLTADQIAMLPPEQRQSI 
LILKEQIQKSTGAS 


5493 


1 


1876 


lU^MMTKAVPEEPRK^GRLTOAI^SPLTWEHVWICVPC^TPDCL 

TDTFRVKRPHLRRSASNGIJVPOTPVYREKEDMYDEIIELKKSLH 

VQKSDVDIjMRTKLRRLEEENSRKDRQ IEQLLDPSRGTDFVRTLA 

BKRPDASWVINGLKQRILKLEQQCKEKDGTISKIjQTDMKTTNLE 

EMRLAMETYYEEVHRLQTLIiASSETTGKKPLGEK^ 

GS ALLSLSRS VQE LTBENQS LKEDLDR VLSTS PTI SKTQGYVEW 

SKPRIJJRRIVELEKKIiSVMESSKSHAAEPVRSHPPACLASSSAL 

HRQPRGDRNKDHERLRGAVRDLKEERTALQEQLLQRDLEVKQLL 

QAKADLEKBLECAREGEEERRERE BVLREE IQTLTS KLQE LQEM 

KKEEKEDCPEVPHKAQELPAPTPSSRHCEQDWPPDSSBEGLPRP 

R3 P CSDGRRDAAAR VLQ AQ WKVYKH KKKKAVL DEAA WLQAA FR 








GHl>TRTKI^ASKAHGSEPPSVPGLPDaSSPVPRVPSPlA^ATGS^ 
PVQEBAI VI IQSALRAHLARARHSATGKRTTTAASTRRRS ASAT 
HGDASSP PFLAALPDPSPSGPQAVAPLPGDDVNSDDSDD I VXAP 
SLPTKNFPV 




71 


536 


RSKAKIGTPTRBVPSTDMXVRRESSSSLTHRPAPSPATPRLLGT~" 
RRVLLGVS EGTGCADAMELVLVFLCS LLAFMVIiASAAE KE KEKD 
PFHYDYQTLRIGGLVFAWLFSVGILLILSRRCKCSPNQKPRAP 
GDE BAQVENLI TAN ATE PQKAEN 


5495


273 


2168 


LlL IQVDTMP FTLHLRSRLPS AI RSL I LQ KKPN 1 R NT S SMAG 
ELRPASLWLPRSIAPAFERFCQVKTGPLPLLGQSEPEKWMLPP 
QGAISETRMGHPQFWKYEFGACTGSIiASLEQySBQLKDMVAFFL 
GCSFSLEEALEKAGLPRRDPAGHSQAGAYKTTVPCVTHAGFCCP 
LWTMRPI PKDKLEGLVRACCSLGGEQGQPVHMGDPELLGI KEL 
SKPAYG33AMVCPPGEVPVFWPSPLTSLGAVSSCETPLAFAS IPG 
CTVMTDLKDAKAPPGCLTPERI PE VHHIS QD PLH YS IAS VSASQ 
KIRELESMIGIDPGNRGIGHLLCKDELLKASLSLSHARSVLXTT 
G FPTO FNHEP PEETDGPPGA VALVAFLQALEKE VAII VDQRAWN 
LHQKIVEDAVEQGVLKTQIPILTYQGGSVEAAQAFLCKNGDPQT 
PRFDHLVAIERAGRAADGNYYNARKMNIXHLVDPIDDLFLAAKK 
IPGISSTGVGDGGKrELGMGKVKEAVRRHIRHGDVIACDVEADFA 
VXAGVSNWGG YALACAIiYI LYS CAVHS QYLRKAVG PSRAPGDQA 

WTQ^PSVIKBEKMIjGILVQHKVRSGVSGIVGMEVDGLPFHNTH 
AEMIGKLVDVTTAQV 


5496 


3 


2408 


QDT KMHE I YKGN I TPQLNKNTLKTS AATDVWAVYFS QP W I D Y3G " 

mksgkgrpispvdsfplsiwicqptryaesqkepqtcnqvslnt 
sqsessdlagrlkrkkllkeyystesepltnggqkpsssdtffr 
fspssseadihllvhvhkhvsmqinhyqyllllfiiheslillse 
nlrkdveavtgspasqtsicigillrsaelalllhpvdqantlk 

SPVSESVSPVVPDYIiPTENGDFLSSKRXQISRDINRIRSVTOJH 
MSDKRSMSVDLSHI PLKDPLLFKSASDTNLQKGIS FMDYLSDKH 
LGKI S EDESSGLVYKSGSGE I GSET S D KKDS FYTDS S S VLNYR3 
DSNII^FDSrX^QNILSSTLTSKGNBTIESIFKAEDLLPEAASI, 
SENLDISKEETPPVRTLKSQSSLSGKPKERCPPNIjAPLCVSYKN 
MKRSSSQMS LDTI S LDSM I LE EOLLE SDGSDSHM PT ,F If nwrr ttmq 
TTNYRGTAES VNAGANLQNYGETS PDAI STNSEGAQENHDDLMS 

vvvfkitgvngeidirgedteiclqvnqvtpdqlgnislrhylc 
nrpvgsdqkavihsksspeislrfesgpgavihsllaekngflq 

raiKNFSTEFLTSSIJ^IQHFLEDETVATVMPMKlQVSNTKINI, 
KlJI^PRSSWSLEPAPVTVHIDHLVVERSDDGSFHIRDSHMLNT 
GUDLKENVKS DS VLLTSGKYDLKKQRS VTQATQTS PGVPWPSQS 

ANFPBFSFDFT^BQLfffiENBSLKQEIiAKAKMALAEAHLEKDALL 
HHIKKMTVE 


5497 


1821 


3308 


S I S KtiLKRRSNIDAYLLSNSCAFFAPR JUFS LASQ 1 1 REQQS PNV" ' 

CFIYKYSGFPSLBCQCHFVSPHSSCYIWFFSFPPPFFVCFQLSN 

GFSHYSI^SESHVGPTGAGLFPHCLPASRLLPRVTSVHLPDYAH 

YYTIGPGMFPSSQ I PSWKDWAKPGPYDQPLVNTLQRRKEKREPD 

PNGGGPTTASGPPAAAEEAQRPRSMTVSAATRPGEEMEACEELA 

LAIiSRGLQLDTQRSSRDStiQCSSGYSTQTTTPCCSEDTIPSOVS 

DYDYFSVSGDQEADQQEFDKSSTI PRNSD1SQS YRRMFQAKRPA 

STAGLPTTLGPAMVTPGVATIRRTPSTKPSVRRGTIGAGPIPIK 

TPVI PVKTPTVPDLPGVLPAPPDGPEERGERS PES PSVGEGPQG 

VTSMPSSMWSGQASVWPPliPGPKPSIPEEHRQAIPBSEAEDQER 

EPPSAWSPGXJIPESDPADLSPRDTPQGEDMLNAlRRGVKIiKKT 
TTNDRSAPRFS 


5498 


2434 


1492 



ILTHQEIFTGEXPCECGKASIQMSHLSQQKIYSGENPFACKVCG 
KVFSHKSNLTEHEHFHTR2KPFECNECGKAFSQKQYVIKHQNTH 
TGEKLFEQJECGKSFSQKEl^LTHQKIHTGEKPFECKDCGKAFI 
□KSNLIRHQRTHTGEKPFVCKECGKTFSGKSNLTEHBKIH1GEK 
PFKCSECGTAFGQKKYLIKHQNIHTGEKPYECMECGKAFSQRTS 
LIVHVRIHSGDKPYECNVCGKAFSQSSSLTVHVRSHTGEKPYGC 
^CGKAFSQFSTLALHIiRIHTGKKPYQCSECGKAFSQKSHHIRH 

QKIHffi " ■ 


5499


324 


926 


GFGQIGRGHKI "TYPFS PRKSGRKGMAQSQGWVKRYI KAFCKGF ' 
FVAVFVAVTFLDRVACVARVEGASMQ?SLNPGGS(^SDVVL^ 
WKVRNFEVHRGDIVSLVSPKNPEQKI1KRVIAXEGDIVRTIGHK 
NRYVKVPRGHIWVEGDHHGHSFDSNS FGPVSLGIjLHAHATHI LW 
PPERWQKLESVLPPERLPVQRBEE 


5500


1978 


1286 


KPDWRLQNLPPRLYLWRSSRFGFGHLKKRLQMDFKIEHTWDGPP ' 

VKHBPVFIRLNPGDRGVMMOISAPFFRDPPAPLGBPGKPFNEIiW 

DYEWEAFFLNDITEQYLEVELCPHGQHLVLLLSGRRNVWKQEL 

PLSFRVSRGETKWEGKAYLPWSYFPPNVTKFNSFAIHGSKDKHS 

YEALYPVPQHBLQQGQKPDFHCLByFKSFNFNTL^GEEWKQPSS 

DLWL1BKCDI 


5501 


2927 


2226 


CRPPVSARVAPGHQGAVGGSGRRPARVEWDAAARPSSRPFSLP " 
AAIMI^ISRLLDWFRSLFWKEEMELTLVGLQYSGKTTFVNVIA 
SGQF SE DMI PTVGFNMRKVTKGNVTI KI WDI GGQ P RFRS MWERY 
CRGVNAI VYMIDAADREKI EASRNELHNLIiDKPQLQGI PVLVIiG 
NKRDLPNAUDBKQLIEKMNLSAIQDRE I CCYS I S CKEKDN I D I T 
IiQWLIQHSKSRRS 


5502 


3 


624 


nsafpvwvpe*tai*tcploaapo^^ 

GKFFKGGGSSKSRAAPSPQEALVRLRETEEMLGKKQEYLENRIQ 
RE I ALAXKHGTQNKRAALQ AL KRKKR FE KQLTQ IDGTkST I E FQ 
REALEN SHT^EVLRNMGFAAKAMKS VHENMDLNKI DDLMQ E I T 
EQQDIAQEISEAFSQRVGFGDDFDEDELMAELEEIjEQEELNKKM 
TN I RLPNVP SSSLPAQPNRKPGMS STARRSRAAS S QRAEEEDDD 
IKQLAAWAT 


5503


216


654 


KG VRRRGRVRSDS EDS HLG Y FKMS FL ti P KLTS KKEVDQAiKSTA 
EICVLVLRFGRDEDPVCLQLDDI LSKTSS DLSKMAAIYLVDVDQT 
AVYTQ YFD I S Y I PSTVFFFNGQHMKVDYGGEDPALRS IKAVRRT 
SPAGTLGEKPVNS 


5504 


58 


3563 


QLSFSFQAPVTFDDITVYLLQEEWVLIjSQQQKELCGSNKLVAPL 

GPTVANPSLFRKFGRGPEPWI^SVQGQRSLLEHHPGKKD^GyMG 

EMEVQGPTRESGQSLPPQKKAYLSHI*STGSGHIEGDWAGRNRKL 

LKPRSIQKSWFVQFPWLIMNEEQTALFCSACREYPSIRDKRSRL 

I EGYTG P FKVETLKYHAKS KAHMFCVNALAARDPI WAARFRS IR 

DPPGDVLASPEPLFTADCP I FYPPGPLGGFDSMAELLPSSRAEL 

BDPGGIX3AIPAMYIJJCISDLRQKEITDGIHSSSDIWILrM2AVE 

SCIQDPSAEGLSEEVPWFEELPWFEDVAVYFTREEWGMLDKR 

QKELYRDV>D^5NYELLASI/jPAAAKPDLISKLERRAAPWIKDPN 

GPXWGKGR PPGN KK^AVREADTQASAADSAIiliPGS P VEARAS C 

CSSSICEEGDGPRRI KRTYRPRS IQRSWFGQFPWLVIDPKETKI» 

FCSACIERPNLHDKSSRLVRGYTGPFKVETLKYHEVSKAHRLCV 

NTVE IKEDTPHTALVPE ISSDLMANMEHFFNAAYS IAYHSRPLN 

DFEKILQIJ.QSTCTVILGKYP^RTACTQFIKYISETLKREIIiBD 

VRNS PCVSVLLDSSTDASEQACVGI YIRYFKQMEVKESYITLAP 

LYSETATXTYFETIVSAIiDELDIPFRKPGWVVGLGTDGSAMLSCR 

GGLVEKFQEVT PQLLP VHCVAHRLHLAWDACGS I DLVKKCDRH 

IRTVFKFYQSSNKRLNELQEGAAPLEQEI IRLKDLNAVRWASR 

RRTLHALLVS W PALARHLQR VAEAGGQ I GHRAKGMLKLMRGFHF 

VKFCHFLLDFLS I YRP LS E VCQKEIVLI TEVNATLGRAYVALES 

LRHQAGPKEEBFNASFKDGRLHGICLDKLEVAEQRFQADRERTV 

LTGIEYI^RFDADRPPQLICNMEWDTMAWPSGIELASFGNDDI 

IaNIiARYFBCSLPTOYSEEALLEEWLGLKTIAQHLPFSMLCKNAL 

AQHCRFPLLS KLMA VWCVP I STSCCERGFXAMNRIRTDERTKL 

SNEVLNMr^TAVNGVAVTEYDPQPAIQHWYLTSSGRRFSHVYT 

CAQ VPARS PASARLRKEEMGAL YVEE PRTQ KP P I LPSREAABVL 

KDCIMEPPERLLYPHTSQEAPGMS 


5505 


"T312 


1219 


ncsprsi^aaki^snrnnnklpsnlpqlqnlikrdppayieeflq" 

QYNHYKSNVE I FKLQPNKPS KELAELVMFMAQI SKCYPE YLSNF 
PQEVKDLLS CNHTVLDP DLRMT FCKALI LLRNKNLINPSSLIjEL 
FFELFRCHDKLLRKTL YTHI VTD I KN INAKH KNNKVNWLQN FK 








YTMLRDSNATAWCMS LDVMIELYRRNIWNDAKTVNVI TTACFS K 
VTKILVAALTFFIGKDEDSKQDSDSESEDDGPTARDLLVQYATG 
KKSS KNKKKLEKAMKVLKKHRKKKKPEVFNFSAIKLIHDPQD FA 
EKLLKQLECCKERFEVKMMLMNLI SRLVG IHELFLFNF YP FLQR 
FLOPHQRBVTKILLFAAQASHHLVPPEIIQSLLMTVA^FVTDK 
NSGE VMTVGINAIKE 1 TARCPIJIMTEELLQDLAQYKTHKDKMVM 
MS ARTL IHL FRTLNPQMLQ KK FRGKPTEAS I EARVQEYGE LD AK 
DYIPGAEVLEVEKEEMAENDEDGWESTSLSEEEDADGEWIDVQH 
SSDEBQQEISKKLNSMPMBERKAKAAAI5TSRVLTQEDFQKIRM 
AQMRKEIJ>AAPGKSQKRKYIEIDSDEEPRGELLSLRDIERIjHKK 
P KS DKETRIiATAMAGKTDRKEFVRKKTKTN PFSS STNKEKKKQK 
NFMMMRYSQNVRSKNKRSFREKQLALRDALLKKKKRMK 


5506


1 


1531 


FRGDLCGQRGGSAPGEGGSSAWPAPAkPLPEREREREALCPGRS 
CSGGGGEETPGTTPVWS PliEGGGDEELRPNPYVRFPYRWWAVW 
LAAFPSLGAGGETPEAPPESWTQXiWFFRFVVNAAGYASFMVPGY 
LL VQ Y FRRKNYLETGRGLCFPIiVKACVFGNE PKASDE VPLAPRT 
EAAETTPMWQALKLLFCATGLQVSYLTWGVLQERVMTRSYGATA 
TS PGER FTDSQFIiVLMKR VLAXi IVAGLS CVIiCKQPRHGAPMYRY 
SFASLSNVLSSN^YEALKFVSFPTQVtiAKASKVIPVMLMGFOLV 
SRRSYEHWBYLTATLISIGVSMFLLSSGPEPRSSPATTLSGLII* 
LAG YI AFDS FTSNWQDALFA YKMS SVQMMFGVNFFS CLFTVGSL 
LEQG ALLEGTRFMGRHS E FAAHALLLS ICS ACGQLFIFYTI G QF 
G AAVF T 1 1 MTLRQAFA ILLS CLLYGHTVTWGGLGVAVVFAALJj 
LRVYARGRLKQRGKKAVPVESPVQKV 


5507 


3704


1271 


PRGTRRCRPAGRASRRARRRPPCPGPAAPGSLEIGGFGTAAGKK " 
VAVAD VQFGP^fRFHQDQ)^}VtJCiVFTKED^^QCNGFCRACE KAG FK 
CTVTKEAQAVLACFLDKHHDIIIIDHRNPRQLDASALCRSIRSS 
KJ.iS ENTVI VGVVRRVDREE1*S VMPFIS AGFTRRYVENPN I MACY 
l^LLQLEFGEVRSQLKLRACWSVFTALENSBDAISITSEDRFIQ 
YAN PAFETTMG YQSGELI GKBLGBVP INEKKADLLDT INS C I RI 
GKEWQGI YYAKKKNGDNIQQNVKI I PVIGQGGKIRHYVS I IRVC 
NGNNKAEKISBCVQSDTHTDNQTGKHKDRRKGSLDVKAVASRAT 
E VSSQRRHSSMARIHSMT 1 EAP ITKVINI I NAAQES S PM P VTEA 
LDRVLE ILRTTELYSPQFGAKDDDPHANDLVGGLMSDGLRRLSG 
NEYVLSTKNTQMVS SN I ITP I S LODVP PRI ARAMENEEYWDFD I 
FBLEAATHNRPLIYIiGLKMFARFGICEFLHCSESTLRSWLQI IE 
ANYHSSNPYHNSTHSADVLHAXAYFXSKERIKETLDPIDEVAAL 
IAATIHDVDHPGRTNSFLCNAGSELAILYNDTAVLESHHAALAF 
QLTTGDDKCNIFKNMERNDYRTLRQGI I DMVLATEMTKHFEHVN 
KFVNS INKPLATLEENGETDXNG^VINTMIiRTPENRTLIKRMLI 
KCADVSNPCRPIiQ YCI EWAAR I SEEYFSQTDEE RQQGLFWMP V 
FDRNTCS IPKSQIS FIDYFITDMFDAWDAFVDLPDLMQHLDNIXF 
KYWKGLDEMKLRNLRPPPB 


5508 


1151 


S9l 


LSSVFSRRSASMFAVGCSMGPFLHYV?YLSLDRLFPASGLRGFPN ~ 
VLKKVLVlX?LVASPLLGWYFIiGLGCIiEGOTVGBSCQEriREKFW 
E F YKAD W C VWP AAQ FVNFLFVP PQ FR VTY I NGLTLG WDTYL S Y L 
KYRSPVPLTPPGCVALDTRAD 


5509 


1238


619 


RK5 RGCQNALSASG PAAAAAAI MVRKLKFfatEQKLti)(QVDFLN^ E 
VTDHNLHELR VI RRYRLQRRED YTRYNQLS RAVRELARRLRDL P 
ERDQFRVRASAALLDKLYALGLVPTRG SLELCD PVTASSFCRRR 
LPTVLLKLRMAQHLQ AAVAF VEQGHVRVG PDWTDPAFI>VTR S M 
ED FVTWVDSS KI KRHVLEYNEERDDFDLEA 


5510 


96 




rjVjHtiiji* a k VEPGKGR VGAR VKGERGLQASGSAPGRS KM 
AEGERQPPPDSSBEAPPATQNFI IPKKEIHTVPDMGKWKRSQAY 
ADYIG F I LU'LNEGVKGKKLTFE YRVS EAIE KLW^LNTLDR W I D 
ETPPVDQPSRFGNKAYRTWYAKI^EEAENLVATVVPTHLAAAVP 
E VAVYLKES VGNSTR I DYGTGHEAAFAAFLCCLCKIG VLRVDDQ 
lAIlTFKVFNRYLF^RKI^PCTYRMEPAGSQGVWGLDDFQFLPFI 
WGSSQLIDHPYLEPRHFVDEKAVNENHKDYMFLEC1LFITEMKT 
GPFAEHSNQLWNISAVPSWSKVNQGLIRMYKAECLEKFPVIQHF 
KFGSLLPIHPVTSG 


5511


276 


1960



KLSRVLNLPPENHTSISAVPiSQK^EVADFQLSVDSLLEKDND 
HSI^DIQVQAKRLAEKIiRCDTVVSEISTGQRTVNliTClNRELIiTK 
TVLQQVI EDGS KYGLKSELFSGLPQKKI WEFSS PNVAKKFHVG 
HLRSTIIGNPIANLKBALGHQVIR1NYLGDWGMQFGLLGTGFQL 

LGD VQ AL S L WQKFRDLS I BEY I RVYKRLGVY FDE YSG ES F YRE K 
SQEVLKhhESl^GLLLKTIKGTAVVDLSCNGDPSSICTVMRSDGT 
S LY ATRDLAAAI DRMD KYNFDTM I YVTDKGQKKHPQQVFQMLKI 
MG YDWAERCQHV P PGWQGMKTRRGDVTPIiEDVLNE I QLRMLQN 
MASIKTTKELKNPQETAERVGLAALIIQDPKGLLLSDYKPSWDR 
VFQSRGirroVPLQYTHARLHSIiEETFGCGYLNDFNTACLQEPQS 
VS I LQHLIiR FDEVL YKSS QDFQ PRH I VS YhLThSHhAAVAHKTL 
QIKDSP PEVAGARLHLFKAVRSVLANGMKLIjG itpvcrm 


5512


120 


1015 


DPSLIiLTITVTGVTVLVL^LtoMNSRRRfiPITLQDPEAKYPLPL 
lEKEKlSHNTRRFRFGLPSPDHVLGLPVGNYVQLLAXIDNELW 
RAYTPVSSDDDRGFVDLI I KI YPKNVHPQYPEGGKMTQYZiENMK 
IGETIFFRGPRGRLFY1IGPGNIK3IRPDQ/TSEPKKTIADHLGMIA 
GG TG I TPMLQLI RK I TKDPSDRTRM3L I PAN QTE ED I LVRKELE 
BIARTHPDQFDIiWYTLDRPPIGWKYSSGFVTADMIKEHLPPPAK 
STLILVCGPPPLIQTAAHPNIiEKliGYTQDMI FTY 


5513 


2 


837 


ARWRLPSDSPRIPPAGAETPGRG^CRNYLPSSSPPPPEPSSFPS 
P PTSRGGPGSRDTMS DS EEESQDRQLKI WLGDGASG KTS LTTC 
FAQETFGKQYKO/TIGLDFFIiRRITIiPGNIiNVTLQIWDIGGQTlG 
GKrtLDKYIYGAC^3VLLVYDITKYQSFENLEDWYTVVKKVSEESE 
TQPLVALVGNKIDLEHMRTIKPEKHLRFCQENGFSSHFVSAKTG 
DS VFLCFQKVAAEILGIKLNKAEIEQSQRWKAD IVNYNQEPMS 
RTVNPPRSSMCAVQ 


5514 


1295 


449 


VNRPSWIMGNFRGHALPGtFFF!tlGtWWC^^ILKYICKKQKRT 

CYLGSKTLFYRLEILEGITIVGMALTGMAGBQFI PGGPHLMLYD 

YKQGHWNQLLGWHHFWYFFFGLLGVADILCFTISSLPVSLTKL 

MLSKALFVBAFIFYNHTHGREMIJ)IFVHOLLVLVVFIjTC 

EFLVRNNVLLELLRSSIilLWJGSWFFQIGFVLYPPSGGPAWDLM 

DHENILFLTICFCWHYAVTIVIVGMNYAFITWLVKSRLKRLCSS 

EVGLLKNAEREQESEEEM 


5515
5516


1572 


260 


FVRLVGP^DCDPI^VCLrTMPtVEGLGSGGEKTAVVIDLGEAF 
TKCGFAGETGPRCIIPSVIKRAGMPKPVRWQYNINTEELYSYL 
KEFIHILYFRHLLVNPRDRRWI IESVLCPSHFRETLTRVLFKY 
FEVPSVU^SHLMALLTLGINS^ 

VLNCTGALPLGGKALHKELETQLLEQCTVDTSVAKEQSLPSVMG 
SVPEGVLEDIKARTCFVSDLfKGIiKIQAAiOWIDGNNERPSPPP 
NVD YP LDGEKI LH I LGS IRDS WE I LFEQDNEEQ S VATL I LDS h 
r x urKKQIjAENLi WI GGTS MIjPGFliHRtiLAEIRYI*VEKPKY 
KKAIX3TKTFRIHTPPAKANCVAWLGGAI FGALQDILGSRS VSKE 




3 


735 


N5RE PPQAGPQPS PRKS PTASS FLFP WRPIASSF WMGAQGAQES 
I KAMWRVPGTTRRPVTGES P^MHRPEAMI»LLLT1ALI»GGPTWAG 
KMYGPGGGKYFSTTEDroHElTGLRVSVGLLLVKSVQVKIiGDSW 
DVKLGALGGNTQEVTLQPGE Y I TKVFVAFQAFLRGMVMYTS KDR 
YFYFGKLDGQISS AYPSQEGQVLVG1YGQYQLLG I KS IGFEWNY 
PLEEPTTEPPVNLTYSANSPVGR


5517 




499 


SEI YVAMRTDSS KMTDVESG VANFASSARAGRRNALPDIQSSAA - " 
TDOTS DLPLKLEALS VKEDAKE KDEKTTQDQLEKPQJfEEK 


5518 


3 


1375 



DAWADAWVRAWDLNMDFPCLWLGLLLPLVAALDFNYHRQEGMEA 
FLKTVAQNYSS\m{LHSIGKSVKGRNLWVLVVGRFPKEHRlGIP 
E FKYVAHMHGDETTORELLLHIiIDYLVTSDGXDPE ITNLINSTR 
IHIMPSMNPDGFEAVKKPDCYYS IGRENYNQYDLNRNFPDAFE Y 
t^SRQPElVAVMKWLJaOTFVLSANIiKGGALVAS YPFDNGVQA 
roALYSRSLTPDDDVFQYLAHTYASRNPNMKKGDECJGJlKMNFPN 
3VTN3YSWYPLQGGMQDYNYIWAQCFEITLELSCCKYPREEKLP 
SFWNNNKASLI EYI KQ VH OGVKGQVFDQNGNPL PNVI VE VQDRK 
SEQ 

ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








HICPYRTNKYGEYYLLLLPGS yiinvtvpghdphitkvi IPEKS 
QN FS ALKKD I LLP FQGQLDS 1 P VSNPS CPM I PLYRNLP DHSAAT 
KPSLFLFLVSLLHIFFK 


5519 


87 


477 


I KSKLNQQVEVQESBWRLTEAKGPTMGKESGWDSGRAAVAAWG 
G WAVGTVLVALS AMG PTSVG IAASS IAAKMMS TAAI ANGGGVA 
AGSLVAILQSVGAAGLSVTSKVIGGFAGTALGAWLGSPPSS 


5520 


117 


943 


PTEGRQKVLKTFTVPRSALAMTKTSTCIYHPLVLSWYTFLNYYI 
SQEGKDEVKP fCILANGARWKYMTIjUJr.TT.nTT wnxrvr-r nru/r v 

RTKGGKDIKFLTAPRDLLFTTLAFPVSTPVFLAFWlIiFLYNRDL 
IYPKVLDTVIPWLNHAMHTPIFPITLAEVVLRPHSYPSKKTGL 
TIJ^AAASIAYISRILWLYFETGTWVYPVPAKLSLLGIAAPFSLS 
YVF IAS I YLLGEKLNHW KWVSVQ IliQRWRLESVGI CFQ W PDWXS 
PAKHQLVKNIR 


5521 


S4<* 


911 


K I LNMQKS CEEN EG K P QN M PKAE t)R PLE DVPQEAEGN PQ PS H E 
GVSQBAEGNPRGGPNOPGQGFKEDTPVRHLDPEEMIRGVDELER 
LREEIRRVRNKFVMMHWKQRHSRSRPYPVCFRP 


" 5522 


1224 


637 


GSRPLGQRSREKMWVFGYGSLIWKVDFPYQDKLVGYITNYSRRF 
WQGSTDHRGVPGKPGRWTLVEDPAGCVWGVAYRLPVGKEEEVK 
AYLOFREKGGYRTTTVIFYPKDPTTKPFSVLLYIGTCDNPDYXG 
PAPLEDIAEQI FNAAGPSGRNTEYLFELANSIRNLVPKKADEHL 
FALEKLVKERLEGKQNLNCI 


5523 


3 


1280 


SKGKKRMGSSMSAATARRPVFDDKEDVNFDHFQ ILRA1GKGS FG 

J\vv_j. v WilK VI a jsJl xTuq KYMNKQQC I ERDEVRNVFkELE I IiQ B IE 

HVFLVNLW YS FQDEEDMFMWDLI»IK3GDLRYHLQQNVQFSED1V 
RL YI CEMALALD YLRGQHI I HRDVKPDNI LLDERGHAHLTD FNI 
ATI IKDGERATALSGTKPYMAPBIFHSFVNGGTG YSFEVDWWSV 
GVMAYELLRGWRPYDIHSSNAVESLVQLFSTVSVQYVPTWS KEM 
VALLRKEiLTVWPEHRLS S LQDVQAAPALAGVL WDHLSEKR VE PG 
FVPNKGRLHCD PT FE LEEMI LESRPLH KKKKRLAKNKSRDNS RD 
S SQSENDYLQDCLDA I QQD FVI FNREKLKRSQDLPPJEP LPAPES 
RDAAEPVEDEAERSALPMCGPICPSAGSG 


" 5524 


85 


2318 


RERERDHRPGESSQGQSGAGGCFPSPTMBLROGGLLFSSRFDSG 
NLAHVEKVESLSSDGEGVGGGASALTSGIASSPDYEFNVWTRPD 
CASTE FENGNRSWFYPS VRGGMPGKL I KINIMNMNKQS KLYSQG 
MAPFVRTLPTRPRWER I RDRPTFBMTETQFVLS FVHRFVEGRGA 
TTFFAFCYPFSYSDCQELLNQLDQRFPENHPTHSSPLDTIYYHR 
BLLCYSLDGLRVDLLTITSCHGLRSDREPRLEQLFPDTSTPRPF 
RFAGKRIFFLSSRVHPGETPSSFVFNGFLDFILRPDDPRAQTLR 

RLFVFKLIPWLNPDGWRGHYRTDSRGVNLNRQYLKPDAVLHPA 
IYGAKAVLIiYHHVHSRIiK*?oqQQPHrtDCJCr , T."DTjnii3\foriT dvkm 

NLQNEAQCGHSADRHNAEAWKQTEPAEQKLNSVWIMPQQSAGLE 
ESAPDTIPPKESGVAYYVDLHGHASKRGCFMYGNSFSDESTQVE 
NMLYPKL1SLNSAHFDFQGC3JFSEKNMYARDRRDGQSKEGSGRV 
A I YKASGI IHS YTLECN YNTGRfl VNS I PAACHDNGRAS PPPPPA 
FPSRYTVELFEQVGRAMAIAAIiDMAECNPWPRIVLSEHSSLTNL 
RAWMLKKVRNSRGLSSTLNVGVNKKRGLRTPPKSHNGLPVS CS E 
NTLSRARSFSTGTSAGGSSSSQQNSPQMKNSPSFPFHG3RPAGL 
PGLGSSTXJKVTK^VLGPVRGKPVWEPUJHVTGCI^ 


5525 


105 


834 


SNTLDFERHLFIMGQQISDQTQLVINKLPElKVAKHVTLVRESGS 
LTYEE FLGRVAELNDVTAKVASGQEKHLLFEVQPGS DSS AFWKV 
VVRWCTKINKSSG1VEASRIMNLYQPIQLYJQDITSQAAGVLAQ 
SSTSEEPDENSSSVTSCQASLWMGRVKQLTDEEECCICMDGRAD 
LILPCAKSFCOKCIDKWSDRHRNCPICRLQMTGANESWWSDAP 
TEDDMAN Y I LNMADEAGQPHRP 


5526 


3 


853 


RRPCN P VRAAKRTGAAARA PRGLE VTMLR VA WRTLS LI RTRAVT 
QVLVPGLPGGGS AKFP FNQWGLQPRSLLLQAARGYWRKPAQS R 
LDDDPPPSTLLKDYQhTVPGIEKVDDWfQ^LSLEMAMKKBMLKI 
KQEQ FMKKI VANPEDTRSIiEARI IALS VKI RB YEHHLEKHRKD K 
AHKRYLLMSIDQRKKMLKin*RNTI^DVFEKICWGLGIEYTFPPL 
YYRRAHRRFVTKKALCIRVFQETQ10 J KKRRRAU<AAAAAQKQAK 








RRNPDSPAKAIPKTLKDSQ 


5527 


3225 


565 


IJiRXYl^HQNPI^LtoQPNRfClS^ATMKLKDTKSRPKQSSCg" 

KFQTKGIKWGKWKEVKIDPNMPADGQMDDLVCFEBLTDYQLVS 

PAKNPSSLPSKEAPKRKAQAVSEEEEEEEGKSSSPKKKIKLKKS 

KNVATEGTSTQFCE FEVKD PELEAQGDDMVCDDPEAGEMTS ENLV 

QTAPKKKKNKGKKGLEPSQSTAAKVPKKAKTWIPEVHDQKADVS 

AWKDLFVPRPVLRALS FliGFSAPTP IQALTLAPAIRDKLDIIiGA 

AETGSGKTLAFAIPMIHAVLOWQKRNAAPPPSNTEAPPGETRTE 

AGAKIT^PGKAEAESDALPEDTVIESEALPSDIAAEARAKTGGT 

VSDQALLPGDDDAGEGPS SL IRE KPVP KQNENEEEN LDKEQTGN 

LKQELDD KS ATCKAYPKR PLLGLVLTPTRELAVQVKQHIDAVAR 

PTGI KTAILVGGMSTQKQQRMLNRRPE I WATPGRLWELI KE KH 

YHLRNLRQLRCLVVDEADRMVEKGHPAEI»SQLLEMLNDSQYNPK 

RQTLVFSATLTLVHQAPARILHKKHTKKMDKTAKLDIiLMQKIGM 

RGKPKVXDLTRNEATVETLTE TK IHCETDEKDFYL YY FLMQYPG 

RSIjVFANSISCIKRLSGLLKVLDIMPLTLHACMHQKQRLRNLBQ 

FARL EDCVLLAT D VAARGL D I PKVQHVIHYQVPRTS E I YVHRSG 

RTARATNEGLSLMLIGPEDVINPKKIYKTLKKDEDIPLFPVQTK 

YMDWKERI RLARQ I E K5 E YRNFQACLHNS W I EQAAAALE I E LE 

EDMYKGGKADQQEERRRQKQMKVLKKELRKUiSQPLFTESQKTK 

YPTQSGKPPLLVSAPSKSBSALSCLSKQKKKKTKKPKEPQPEQP 

QPSTSAN 


5528 


3 


895 


GPFLSACRMWGACKVKVHDSLATIS ITLRRYLRLGATMAKSKFB " 
Y VR D FEAODTCLAHCW VWRLDGRNFHR FAE KHNFAKPNDS RAL 
QLM7KCAQTVMEELED 1 VI AYGQSDEYSFVFKRKTNWFKRRASK 
FMTHVASQFASSYVFYWRDYFEDQPLLYPPGFDGRVWYPSNQT 
LKD YLS WRQADCHINNL YNTVFWALI QQSGLTP VQAQGRLQGTL 
AADKNEILFSEFNIN YNNE PPMYRKGTVLI WQKVDEVMTKE I KL 
PTEMEGKKMAVTRTRTKPCKPSHLPRAPCLRWL 


5529 


48 


640 


tfrlvsahlktrklinpeaaerrwrdwdsrqgwlsvkMqrvsgl 
lswtlsrvlwlsglsepgaarqprimeekalevydlirtirdpe 
kpktle elewsescvevqeineee ylvi i rftp wphcslatl 
iglclrvklqrclpfkkkleiyisegthstbedinkqindkerv 
aaamenpnlreiveqcvlepd 


5530 


4541 


2606 


AQIVHAISYCHKIiHVGHRDLKPENWFFEKQGLVKL.TDFGFSNK~ 
FQPGKKLTTSCGSLAYSAPEILLGDEYDAPAVDIWSLGVILFML 
VCGQPPFQEANDSETLTWIMDCKYTVPSHVSKECKDLI TRMLQR 
DPKRRASLEEIENHPWLQGVDPSPATKYNIPLVSYKNLSEEEHN 
S 1 1 QRMVLGD IADRDAI VEALETTNRYNHITAT YFLLAERI LREK 
QEKEIQTRSAS PSNIKAQFRQS WPTK I D VPQDLEDDLTATPLSH 
ATVPQSPARAADSVLNGHRSiGGLCDSAKKDDLPELAGPAIiSTVP 
PASL KPTASGRKCLFRVEEDEEEDE EDKKPMSLSTQWLRRK PS 
VTNRLTSRICSAPVIiNQl FEEGESDDEFDMDENLPPKLSRIiKMNX 
ASPGTVHKRYHRRKSQGRGSSCSS5ETSDDDSESRRRLDKDSGF 
TYSWHRRDSSEGPPGSEGDGGGQSKPSNASGGVDKASPSENNAG 
GGS PSS GSGGNPTNTSG TTRRCAG PSNS MQLAS RS AGELVESLK 
LMSLCLGSQLHGSTKYIIDPQNGLSFSSVKVQEKSTWKMCISST 
GNAG QVPAVGG I KFFSDHMADTTTELER IKSKNLKNNVLQLPLC 
EKT ISVNIQRNPKEGLLCASS PASCCHVT 


5531 


24 


515 


GSQPRAPRPRDSWERPEPElilRQSWRAVSRSPLEHGTVLFARLF"" 
ALEPDUjPLFQYNCRQFSS PBDCTfiS PE FLDHIRKVMLVIDAAV 
TNVEDLSSLEEYIiAfiI^RKHILaV(WTfT.Q5RRTUrJPQT.T.VMT cvn 

IX5PAFTPATRAAWSQLYGAVVQAMSRGWDGE 


5532 


3395 


1402 


SDWKVVGKRKMIIEDETEFCGEELLHSVLQCKSVFDVLDGEEMR ' * 
RARTFJ^PYEMIRGVFFLNRAAMKMANMDFVFDRMFTNPRDSYG 
KPLVKDREAELL YFADVCAGPGGFS E YVLWRKKlf HAKGFGMTIi K 
GPNDFKLEDFYSAS SELFE PYYGEGG IDGDGDITRPENISAFRN 
FVLDNTDRKGVHFLMADGG FSVEGQENIjQE II»S KjQLLLGQFLMA 
LS IVRTGGHFICKT FDLFTP FS VGLVYLL YCCFERVCLFKP I TS 
RPANSERYWCKGLKVG IDDVRD YLFAVN I KLNQ LRNTDSDVNL 








WPLEVIKGDHEFTDYMIRSNESHCSLQIKAIAKIHAFVQDTTL 
SBPRQAEIRKBCLRLWGIPDQARVAPSSSDPKSKPPELIQGTEI 
DI PSYKPTLLTS KTLEKIRPVFDYRCMVSG3EQKFLIGLGKSQI 
YTWDGRQSDRWI KLDLKTELPRDTLLS VE I VHELKGEGKAQRKI 
SAIHILDVLVLNGTDVREQHFNQRIQIAEKFVKAVSKP53RPDMN 
PIRVKEVYRLBEMEKIPVRLSviKI I KGSSGTPKLSYTGRDDRHP 
VPMGLYIVRTVWEPWTMGPSKSPKKKFFYNKKTKD3TFDLPADS 
IAPFHI CYYGRL FWEWGDG IRVHDS QKPQDQDKLS KEDVLS FIQ 
MHRA 


5533 


94 


789 


MKERRAPQP WARCKLVLVGDVQCG KTAMLQVIiAKDCYPETY VP 
rVFENYTACLBTEBQRVELSLWDTSGS P YYDNVR P LCYS DSDAV 
LLCFDI SRPBTVDSALKKWRTE I LD YCPSTRVLLI GCKTDLRTD 
LSTLMELSHQKQAPISYEQGCAXAKQLGPEIYLEGSAFTSEKSI 
HS I FRTASMLOuNKPSPLPQKSPVRSIiSKRLLHLPSRSELI SPT 
FKKBKAKXCSIM 


"5534 


3 


605 


LVRGRARAANPGRVaAMDGLRQRVEHFLEQRNLVTEVLGALEAK 
TG\^KRYLAAGAVTLLSLYLriFGYGASLLCNLIGFVYPAYASIK 
AIES PSKDDDTVWLTYWVVYAIiFGIiAEFFSDIiLLS WFPFYYVGK 
CAFLLFCMAPRPWNGALMLYQRVVRPLFLRHHGAVDRIMNDLSG 
RALDAAAGITRNVKPSQTPQPKDK 


5535 


1029 


332 


KSFMDSEARLCSLVELSDTQDETQKSDSENEDLKIDCLQESQEL 
NLQKLKNSEllILTKAKQKMREIiTVNIKMKEDLIKELIKTGNDAK 
SVSKQYTLKVTIO^EHDAEQAICVELTETQKQLQELENKDLSDVAM 
KVKLOKE FRKKVDAAKLRVQVLQKKQQDSKKIiASLSIQNE KRAN 
ELEQS VDHMKYQKIQIX3RKLQEEWBKRKQLDAV I KRDQQKI KVI 
LSYIPAKYNMKC 1 


5536 


942 


282 


aaataaslsprgcrlrtpssdVspsrapppsaaplptgraqmsp 

SGRLCLLTI VGLI LPTRGQTliKDTTS SSSADAT I MDIQVPTRAJ? 
DAVYTELQPTSPTPTWFADETPQPOTQTQQIiEGTDGPLVTDPET 
HKSTKAAHPTDOTTTLSERPSPSTDVQTDPQTLKPSGFHBDDPF 
FYDEHTLRKRGLLVAAVLFITGI IILTjSGKCRQLSRLCRNHCR 


5537 



3 



2391 


RARVSSPQLRVFRSGRPRRIiRVLRINRTSVAliRIiAGTGRFVAKT 
PGHPGSWEMGLLTFRDVAVEFSLEEWEHLEPAQKNLYQDVMLEN 
YRNLVSLGIiWSKPDLITfLEQRKEPWNVKSEETVAIQPDVFSH 
YNKDLLTEHCTE AS FQKVI SRRHGS CDLENLHLRKRWKREECEG 
HNGCYDERTFKYDQFDBSSVESLFHQQILSSCAKSYNFDQYRKV 
FTHSSLLNQQEEIDIWGKHHIYDKTSVIiFRQVSTLNSYRNVFIG 
EXNYHCMNS EKTLNQSSSPKNHQENYFLEKQYKCKEF2EVFLQS 
MHGQEKQEQSYKCNKCVBVCTQS LKH I QHQTIHI RENS YSYNKY 
DKDLSQS SNbRKQI I HNEEXP YKCEKCGDSLNHS LHI/TQHQ IIP 
TEEKPYKWKECGKVFNLNCSLYLTKQQQIDTGENLYKCKACS KS 
FTRSSNL I VHQRIHTGEKPYKCKECGKAFRCSSYIiTKHKRIHTG 
EKPYKCKECGKAFNRSSOiTQHQTTHTGEKLYKCiCVCSKSYARS 
SlHilMHQRVHTGBBCPYKCKBCGKVFSRSSCLTQHRKIHTGBNLY 
KCKVCAKPFTCFSNLIVHERIHTGEKPYKCKECGKAFPYSSHLI 
RHHRIHTGEKPYKCKACSKSFSDSSGLTVHRRTHTGEKPYTCKE 
CGKAFSYSSDVIQHRRIHTGQRPYKCEECGKAFNYRSYLTTHQR 
SHTGERP YKCE E CGKAFNSRS YLTTHRRRHTGERP YXCDECGKA 
FS YRS YLTTHRRS HSGERPYKCEECGKAFNSRS YL IAHQR3HTR 
BKL 


5538 


926 


161 


HSMMMKIPWGSIPVIjMLIjIjLLGLIDISQAQIjSCTGPPAIPGIPG ' 

I PGTPGPDGQPGTPGIKGEKGLPGLAGDHGEFGEKGDPGIPGN P 

GKVGPKGPMGPKGGPGAPGAPGPKGBSGDYKATQKIAFSATRTI 

NVPLRRDQTIRFDHVTTNMNN^EPRSGKFTCfCVPGLYYFTYHA 

SSRGNLCVNIiMRGRERAQKVVTFCDYAYirrFQVT^GGMVLKLEQ 

GENVFLQATDKNSLLGMEGANSIFSGFLLFPDMEA 


"15JT" 


38 


1258 


HRGPSGAAAPGCALPRGQAliEGPRSCRRPQPMARRYDELPHYPG " 
I VDGPAALAS FPETVPAVPGP YGPHRPPQPLP PGIiDS DG LKREK 
DEIYGHPLFPLLALVFEKCELATCSPRDGAGAGLGTPPGGDVCS 
SDSFNEDIAAFAKQVRSERPLFS SNPELDNliVTQAl QVLRFHLL 








BLE KVHDLCDN FCHR Y I T CLKGKMPI DL VI EDRDGGCREDFED Y 
PASCPSLPDQNNMWIRDHEDSGSVHLGTPGPSSGGLASQSGDNS 

SDQGDGLDTSVAS pssggededldqerrrnkkrgifpkvatnim 

RAWLFQHLSHPYPSEEQKKQIAQDTGLTII^VNNWFINARRRIV : 
QPM I DQSNRTGQ GAAPS P EGQP IGG YTETQ PHVAVR P PGS VGMS 
LNLBGEWHYL | 


5540 


148 


1440 


pplgagagvharsphparrlplttagvggrapdllptpwrqhrg 

PSGAAAPGCALPRGQALBGPRSCRRPQPMARR YDELPHYPG IVD 
GPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKRBKDEI 
YGHPLFPLLALVFEKCELATCS PRDGAGAGLGTPPGGDVCSSDS 
FNEDNTAFAKQVRS3RPLFSSNPELDNLMIQA1QVLRFHLLELE 
KGKMPJDLVIEDRDGGCREDFEDYPASCPSLPDQmi^lRDHED 
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSYASPSSGGED 
EDLDQEPRRNKKRG t FPKVATNIMRAWLFQHLSHPYPSEEQKKQ 
IAQDTGLTILQVNNWFINARRRIVQPMIDQSNRTGQGAAPSPEG 
Q PIGG YTETE PHVAFRAPAS VG DEFGTR KEEWHYL 1 


5541 


143 


1440 


PPLGAGAGVHARS PHPARRIiPTVTT Afit/ntftB a Dnf .V t>t 5Hd7w57: — 1 
PSGAAAPGCALPRGQALEG PRSCRRPQPMARRYDELPKYPG IVD 
GPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREKDE I 1 
YGHPLFPLLALVFBKCELATC S PRDGAGAGLGTPPGGDVCS SD S 
FKEDNTAFAKQVR5ERPLFSSNPELDNIiMIQAI0VLRFHLLEIjE 
KGKMPIDLVIEDRDGGCREDFED YPASCPS LPDQNNIWIRDHED 
SGS VHLGTPGPSSGG LASQSGDNS SDQGVG LOTSVAS PSSGGED 
EDLDQEPRRNKKRG I FPKVATNIMRAWLFQHLSHPYPSEEQKKQ 
IAQDTGLTILQVNNWFINAKR R I VQPMI DQSNRTCQGAAFS PEG 
QPIGGYTET3PHVAFRAPASVGDEFGTRKEEWHYL | 


5542 



148 


1440 


PPLGAGAGVHARSPHPARRIjPIiTTACTVnrtPapnT.T d 'row d rs u dTt — 
PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPGIVD 
GPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREKDB I 
YGH P LF P LLAL VFE KCELATCS PRDGAGAGLGTP PGGD VCS S D S 
FNEDNTAFAKQVRSERPLFSSNP ELDNLM IQ AIQVLRFHLLELE 
KGKMPIDLVIEDRDGGCRBDFEDYPASCPSLPDQNNIWIRDHBD I 
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTS VAS PSSGGED 
EDLOQEPRRNKKRGI F P KVATNIMRAWLFQHLSHP YPSBEQKXQ 
IAQDTGLT1LQVNNWF INARRRI VQPMIDQSNRTGQGAAF3 PEG 
QPIGGYTETEPHVAFRAPASVGDEFGTRKEEWHYL | 


5543 
5544 


2405 


665 


RWVREQPWPLRTSEAVKTPALRP'FPGPRGVSPFPKPDWGKSPApH 
KR P FS DSGAFWS PERRPG VLEAP RRR PVPAS FRA VP P KPTR VHG 
SSASRDRVLARTMIVADSECRAELKDYLRFAPGGVGDSGPGEEQ 
KBS RARRGPRG PS AFIP VEEVLREGAESLEQHIiGLEALMSS GRV 
DNLAWMGLHP D YFTS FWRLHYL LLHTDG PLAS SWRHY I A I MAA 
ARHQCSYLVGSHMAEFI^TGGDPEWIiLGLHRAPEKLRKIiSEI^ 
LlJ^HRPWLITKEHlOAIiLKTGEHTWSLABLlQAXiVIiLTHCHSLS 
S FVFGCGILPEGDADGS PAPQAPTPPSEQSSPPSRDPLNNSGGF 
ESARDVEALMERMQQLQESLLRDEGTSQEEMESRFELBKSESLL 
VTPSADILBPSPHPDMLCFVEDPTFGYBDFTRRGAQAPPTFRAQ 
DYTWEDHGYSLIQRLYPEGGQLLDEKFOAAYSLTYNTIAMHSGV 
DTS VLRRAIWN YI HCVFG IRYDD YD YG EVNQLLERNLKVY I KTV 
ACYPEKTTRRM YNLFWRHFRHS EKVHVNIJjLLEARMQAALL YAL 
RAITRYMT 




1895 


514 


LrGGLLGRQRLIiLRMGAGRLGAPME RHGRAS ATS VSS AGE QAAGD I 
PEGRRQEPLRRRASSASVPAVGASAEGTRRDRLGSYSGPTSVSR 
QRVES LRKKR PL FPWFGLD IGGTLVKLVY FEPKD1 TAEEEEEEV 
ESLKS I RKYLTSNVAYG S TG I RDVKLE LKDLTL CG RKGNLHF IR 
FPTHDMPAFIQMGRDKNFSSLHTVFCATGGGAYKFEQDFLTIGD 
LQLCKLDEIiDCLIKGILYIDSVGFNGSSQCYYFENPADSEKCQK 
LPFDLKN P YPLLLVN IGSGV S I LAVYS KDN YKRVTGTSLGGGTF 
FGLCCLLTGCTTFEEALEMASRGDSTKVDKLVHDl YGGDYBR FG 
LPGWAVAS S FGNMMS KEKREAVS KEDLARATL IT ITNNIGS 1AR 
WCALNENINQWPVGKFIJWNTIA^!RLIiAYALD YWS KGQLKALF 
3EHEGYFGAVGALLELLKIP j 


5545 


802 


131 


GAMWSAGRGGAAWPVLLGLLLALLVPGGGAAKTGAELVTCQSVL 
KLLNTHHRVRLHSIIDI KYGSGSGQQSVTGVEASDDANS YWRIRG 
GS EGG C PROS PVRCGQAVRLTHVLTGKNLHTHHF PSP LS NNQEV 
SAFGEDGEGDDLDI^TVRCSGQHWEREAAVRFQHVGTSVPIjSVT 
GEQYGS PIRGQHEVHGM PSANTHNTWKAMEGI P I K PS VE PSAGH 
DHL 


" 554^ 


" "1532 


146 


FVP RGGHSS MGQSGRS RHQKRARAQAQLRNLEAYAANPHS PVPT 
RGCTGRKIRQI*SI»D VRRVMEPLTASRLQVRK KJIS LKDCVAVAGP 
LGVTHFLILSKTETNVYFKLMRLPGGPTLTFQVKKYSLVRDVVS 
SLRRHRMHEQQFAHPPLLVLNSFGPHGMHVKLKATMFQNLFPSI 
NVHKVNLNT I KR CLliIDYNPDSQELDFRHYS 1 KWP VGASRGMK 
KLLQEKFPNMSRLQDISELLATGAGLSESEAEPDGDHNITELPQ 
AVAGRGNMRAQQSAVRLTEIGPRMTLQLIKVQEGVGEGKVMFHS 
FVS KTE EELQAI L E AKE KKLRLKAQRQAQQAQNVQRKQE QRRAH 
RKKShEGMKKARVGGSDEEASQIPSRTASLBLGEDDDEQEDDDl 
EYFCQAVGEAP SEDLFP EAKQ KRIiAKS PGRKRXRWEMDRGRGRL 
CDQKFP KTKDKSQGAQARRGPRGAS ROGGRGRGRGRPGKRVA 


5547 


1592 


146 


FVPRGGHSSMGQSGRSRHQKRARAQAQLRNIiEAYAAN^ilslpVFT 
RGCTGRNIROLS LDVRRVMEP LTAfi RLO UR K" RW<?T .tmrva \rnn d 

LGV7^FLILSKTETWWPKLMRLPGGPTLTFQVKKYSLVRDWS 
SLRRHRMHEQQ FAHPPLLVLHSFGPHGMHVKL^TMFQNLFPS I 
NVHKVNUJTIKRCLLIDYNPDSQELDFRIfirSlKVVPVGASRGMK 
KLLQEKFPNMSRLQDlSELLATGAGLSESEAEPDGDHNITEtiPQ 
AVAGRGNMRAQQSA VRLTE I GPRMTLQXjIKVQEGVGEGKVMFHS 
FVS KTEEELQAI LEAKE KKLRLKAQRQ AQQAQWVQRKQEQREAH 
RKKS LEGMKKARVGGSD EEASG 1 PSRTAS LELG EODDEQEDDD X 
EYFCQAVGE APSEDLFPEAKQKRLAKS PGRKRKRWEMDRGRGRL 
CDQKFPKrKDKSQGAQARRGPRGASRDGGRGRGRGRPGKRVA 


" 5548 


1 


2153 


DQTG P PETXAFT FPRSTMEP LCPLIiLVGFSLPLARAIjRGNETTA 
DSNETTTTSGP PDPGASQPLIAWLLLPIiLLLliLVLLLAAYFFRP 
RKQRKAWSTSDKKMPNG I LEEQEQQRVMLLS RSPSGP KKYFPX 
PVEHLE EE IR1 RS ADDCKQFREB FNSLPSGH IQGTFELANKE EN 
REKNRyPNILPITOHSRVII^OLDGIPCSDYIKASYIDGYKBXNK 
F IAAQG P KQE TVND FWRMVW EQKSAT I VMLTNL KE RKEE KCHQY 
WPDQG CTm r GNIRVCVEDCVVLVDYTIRKPCIQPQLPDGCKAPR 
LVS QLHFTSWPD FGVP FT P IGMLKFLKKVKTLNPVHAGPIVVHC 
SAGVGRrGTFIVIDAMMAMMHAEQKVXJVFEFVSRrRNQRPQMVQ 
TDMQYTFIYQALLEYYLYGDTELDVSSLEKHLG/rMHGTTTHFDK 
I GLEEEFRKLTNVR I MKENMRTGNLP ANMKKARVIQI I PYDFNR 
VILSMKRGQEYTDYINASFIDGYRQKDYFIATQGPLAHTVKDFW 
RMIWEWXSHTIVMLTEVQEREQDKCYQYWPTEGSVTHGEITIEI 
KNDTLSEAISIRDFLVTLNQPQARQEEQVRVVRQFHFHGWPEIG 
I PAEGKGMXDLI AAVQKQQQQTGNHP ITVHCSAGAGRTGTF IAL 
SNILERVKAEGLLDVPQAVKSLRLQRPHMVQTLEOYEFCYKVVQ 
DFIDIFSDYANFK 


5549 


915 


256 


FEATGGKRLAFKMAGTARHDREMAIQAKKKIiTTATDPIERIjRLQ 
CLARGSAGIKGLGRVFRIMDDDMNRTLDFKEFMKGLNDYAWME 
KEEVEELFQRFDKDGNGTI DFNEFLLTLRPPMSRARKEVIMQAP 
RKLD KTGDGV I T I EDLRE VYNAKHHPKYQNGE WS E EQ VFRXFLD 
NFDSPYDKDGLVTPEEFMNYYAGVSAS IDTDVYFI IMMRTAWKL 


5550 


2364 


1210 


RKR KV FLKMRRLNRKKTLS L VKELDAFP KVPES YVETS ASGGT V 
SLIAFTTMALLTI MEFSVYQDTWMKYEYEVDKDFSS KLRIN I DI 
TVAMKCQYVGADVLDrjAETWASADGLVYEPTVFDJtSPQQKENQ 
RMLQL I QS RLQEEHS LQD VI FKS AFKSTSTALP PREDDSS QS PN 
ACRIHGHLVWKVAGNFKITVGiCAIPHPRGHAHLAALVNHESYN 
FSHRI DHLS FGBLVPAI INPLDGTEKI AIDHNQMFQYFITWPT 
KLHTYKISADTHQFS VTERERI INHAAG3HGVSGI FMKYDLSSL 
MVTVTEEHMPF^QFFVRIjCGIVGGIFSTTGMLHGIGKFIVEI IC 
CRFRLGSYKPVNSVPFEDGHTDNHLPLLEKNTH 


5551 


211 


1700 


MQRDHTMDYKESCPSVSIPSSDEHREKKKRFTVYKVLVSVGRSE 








wfvfrryaefdklyntlkkqfpamalIcj! pakrIpgdnfdpTdfik 

QR RAGLNE FIQNLVR YPE LYNHPDVRAFLQMDSPKHQSDPSE DB 
DERSSQKLHSTSQNINLGPSGNPHAKPTDFDFLKVTGKGSPGKV 
LLAK RKLDGKF YA VKVLQ KKI VLNRKEQXH IMAERNVLIjKNVKH 
PFLVGLHYSFOTTEKLYFVLDFVNGGELPPHLQRERSFPEHRAR 
FYAAE IASALGYLHS I KI VYRDLKPBNILLnSVGHWLTDFGLC 

E ML YG LP P FYCRDVAEMYDN I LHKPLSLR PGVSLTAWS ILE ELL 
EKDRQNRLGAKEDFLEIQNHPPFESIjSWADLVQKKIPPPPNPNV 
AGPDDI RNFDTAFTEETVP YSVCVSSDYS IVNAS VLEADDAFVG 
FSYAPPSEDLFL 




2746 


930 


LGPAAGAAMQ KKHK KHKAE WRS S YEDYADKPIiEK PLKLVLKVGG 
SEVTELSGSGHDSSYYDDRSDHERERHKEKKKKKKKKSEKEKHL 
DDEERRKRKEEKKRKREREH CDTEGEADDFDPG KXVEVE PPPDR 
PVRACRTQPAENESTPIQQLLEHPLRQLQRKDPHGFFAPPVTDA 
I APGYSMI I KHPMDFGTMKDKI VANEYKS VTEFKADFKLMCDNA 
MTYNRPDTVYYKLAKKI LHAG FKMMS KOAALLGMHiyrAVPP pvo 
EWPVQVETAKKSKKPSREVISCMFEPEGNACSLTDSrAEEHVL 
ALVEHAADEARDRINRFLTOGKMGYLKRNGDGSLIiYSVVNTAEP 
DADEEETHPVDLSSLSSKLLPGFTTLGFKDERRNKVTFLSSATT 
ALSMQmSVFGDLKSDEMELLYSAYGDBTGVQCALSLQEFVKIlA 
GS YS KKVWDLLDQI TGGDHSRTL FQLKQRRNVPMKP PDSAKVG 
DTLGDSS S SVLEPKSMKS YPDVS VD Z SMLS S LGKVKKELDPDDS 
HLNLDETTKLLQDLHEAQAERGGSRPSSNLSSLSNASERDOHHL 
GS P S R LS VGEQPDVTHD PYE FLQS PE PAAS AKT 


5553 


74 


1095 


LGREAVYLVSRMDGPVAEHAKQEPFHWTPLLESWALSQVAGWP " 
VFLKCTfVQPSGSFKIRGIGHFCQEMAKKGCRHIjVCSSGGNAGI 
AAA YAAR KLG I P ATI VL PESTS LQ VVQRLQGEGAE VQLTGfCVWD 
EANLRAQELAKRDGWENVPPFDHPLIWKGHASLVQELKAVLRTP 
PGALVLAVGGGGLLAG WAGIXEVGWQHVPI IAMETHGAHCFNA 
AITAGKLVTL PD1 TSVAKS LG AKTVAARALE CHQVCK3HS EWE 
DTEAVS AVQQLLDDERMLVEPACGAALAAI YSGLLRRLQAEGCL 
P PS L T SWVI VCGGNN I NS RELQ ALKTHLGQ V 




166 


2318 


CSGRTGGRGSLR PAENV CLTCK LSGAETKGLLC PALRTWIMK VL 
GRS FF WVLF PVL PWAVQAVEHEEVAQRVI KLHRGRGVAAMQS RQ 
WVRDS CRKLSGLL RQKNAVLNKL KTA I GA VEKD VG LSD E EKL FQ 
VHTFE IFQKF.LireSENSVFQAWGLQRALQGDYKDVVNMKESSR 
QRLEALREAAI KE ETEYMBLIAAEKHQVEALKNMQHCJNOSliSML 
DE I LE D VRKAADRLEEB IEEHAFDDNKSVKG VKFE AVLRVEE BE 
ANS KQN I TKREVEDDLGLSMLIDSQNNQ YILTKPRDSTI PRADH 
HFIKDIVTIGMLSLPCGWLCTAIGLPTMFGYIICGVLLGPSGLN 
SIRS I VQVETLGEFGWFTliFLVGLEFSPEKLRiCVWKISIiQGPC 
YMTLLMI AFGLLWGHLLRI KPTQSVFI STCLSLSSTPLVSRFLM 
GSARGDKEGDIDYSTVLliGf^VTQDVQLGLFKAVMPTLIQAGAS 
ASSSIWEVLRILVLIGQILFSLAAVFLLCLVIKKYLIGPYYRK 
LHMESKGNKEILILGISAFIFLMLTVTELLDVSMELGCFTAGAL 

VSSQGPWTEE iats iepi rdfiai vffas iglhvfptpvayel 

TVLVPLTLSVWMKFLIiAALVLSLILPRSSQYIKMrVSAGLAOV 

sefspvi^srarragvisrevyllilsvttlslllapvlwraai 
trcvprperrssl 


5555 
5556 


212 
*B3S 


1425 
3346 


LSUiTRETPAPPRCEAASOGRVGWRADAAAEEAVRSVWNRTRDR 
GTMAPOWLSTFCLLLLYLIGAVIAGRD FYKILG VPRSAS IKDI K 
KAYRKLALQ LHPDRNPDDPQAQ EKFQD LGAAYEVLS DSEKRKQY 
DTYGEEGLKDGHQSSHGDIFSHFFGDFGFMFGGTPRQQDRNIPR 
GSDI IVDLEVTLEEVYAGNFVEVVRNKPVARQAPG KRKCNCRQE 
MRTTQLGPGRFQMTQE\^DECPm^CLVNEERTLEVEIEPGVRD 
GMEYP F IGEGEPHVDGEPGDLRFRIKWKHPI FERRGDDLYTNV 
TI SLVESLVGFEWDITHLDGHiCVHISKJ5KITRPGyiKLWfG<GEGL 
PNFDNNNIKGSLI ITFDVDFPKEQLTEEAREGIKQLLKQGSVQK 
imiGLQGY 

ITRGMSKNCVPMEFEEYLLRMFQGTFYLLOKITKDNNAHTVKSR 








LEELDESYIEKFTDPIJiLFVSVHLRftlBSV^Ofe'PV^/E^TLtFK 
YTFHQPTHEGYFSCIJDIWTLFLDYLTSKIKSRLGDKEAVIjNRVE 
DAL VLLLTE VLNR I Q FR YNQ AQLE ELD DET LD DDQQTE WQRYLR 
QSLEWAKVl^IJiPTHAPSTLPPVI^DNLEVVljGIiQQFIVTSGS 
GHRZ^ITAEl^OTU^CSU^LSSLU3AVGiU^AEYPIGDVFAAR 
FND ALTW E RL VKVTLYG SQ I KLYNI ETAVP S VLKPDL I DVHAQ 
SLAALQAY SH W LAQYCSE VHRQNTQQFVTL I STTMDAITPLIST 
KVQDiOiLLSACHLLVSLATTVRPVFLISIPAVQKVPNRITDASA 
LRL VDKAQ VLVCRALS N I LLLPWPNLPENEQQWPVRSINHASLI 
SALSRDYRNLKPSAVAPQRKMPLDDTKLI IHQTLSVLBDIVENI 
SGESTKSRQICYQSLQESVQVSLALFPAFIHQSDVTDBMLSFFL 
TLFRGLRVQMGVPFTEQIIQTFLNMFTREQLAESILHEGSTGCR 
WEKFLKILQVWQEPGQVFKPFLPS I IALCMEQVYP 1 1 AERPS 
PDVKAELFELLFRTIJfflNWRYFFKSTVLASVQRGIABEQMENEP 
GFSAIMQAFGQSFLQPDIHIiFKQNLFYLETLNTKQKLYHKKIFR 
TAMLFQFVNVLLQVLVHKSHDLLQEEIGIATYNMASVDFDGFFA 
AFLPEFLTS CDGViy^NQKSVLGRNPKMDRVRRERGRAKRRABWA 
RKPGTCAARRGHIEASGRGLCPPCSLAAAHEMPADLVL 


" 55S7 


1712 


491 


VI LGAGIiRDKDMWI PVVGLPRRLRLS ALAQAGRFCILGSEAATR ' 
KHLPARNHCGLSDSSPQLWPBPDFRNPPRKASKASLDFKRYVTD 
RRLAETLAQIYLGKPSRP PHLLLBCNPGPGILTQALLEAGAKW 
ALESDKTFI PHLESLGKNIiDGKLRVIHCDFFKLDPRSGGVI KPP 
AMSSRGLFKNLGIEAVPWTADI PLXWGMFPSRGEKRALWKLAY 
DLYSCTSIYKFGRIEVNMFIGEKEFQKLMADPGNPDLYHVLSVI 
WQLACEIKVLHMEPWSSFDIYTRKGPLENPKRRELLDQLQQKLY 
LZ QMI PRQNIiFTKNZjTPMNYNI FFHLLXHCFGRRS ATVIDHLRS 
LTPLDARDILMQIGKQEDEKVVNMHPQDFKTIiFETIERSKDCAY 
KWLYDETLEDR 


5558 


1509 


96 



RAGCTHPQVPADLGAPAE PRRPQK^CVCLLQPQPGGQRG PTTMl 
TGWSMRLWTPVGVLTSLAYCLHQRRVALAELQEADGQCPVDRS 
LLKLKMVQWFRHGARSPLKPLPLEEQVEWNPQLLEVPPQTQFD 
YTVTNLAGGPKPYSPYDSQYHETTLKGGMFAGQLTKVGMQQMFA 
LGERLRKNYVEDIPFLSPTFNPQEVFIRSTNIFRNLESTRCLLA 
GIjFQCQKEGPIIIHTDEADSEVLVPNYQSCWSLRQRTRGRRQTA 
SI^PGISBDLKKVKDRMGIDSSDKVDFFILLDNVAAEQAHNLPS 
CPMLKRFARMIEQRAVDTSLYILPKEDRESLQMAVGPFLHILBS 
NLLKAMDSATAPDKIRKLYLYAAHDVTFIPLLMTLGIFDHKW Pp 
FAVDLTMELYQHLESKBWFVQLYYHGKEQVPRGCPDGLCPLDMF 
LNAMSVYTLSPEKYHALCSQTQVMBVGNEE 


5559 


150 


1983 


PLAATAHFAKM3RVAXYRRQVSEDPDIDSLLETLSPEEMEELEK 
BlDVVDPDGSVPVGLRQRNQTEKQSTGVYNREAMLNFCEKETfCK 
LMQREMSMDESKQVETKTDAKNGEERGRDASKKALGPRRDSDLG 
KEP KRGGL KKS FS RDRDEAGGKSGEKP KEEKI I RG IDKGRVRAA 
VDKKEAGKDGRGEERAVATKKEEEKKGSI1RNTGLSRDKDKKRSE 
MKBVAKKEDDEKVKGERRin'DTRKEGE KMKRAGGNTDMKKBDEK 
VKRGTGNTDTKKDDEKVKKNEPLHEKEAKDDS KTKTPEKQTPSG 
PTKPSEGPAKVEEEAAPSIFDEPLERVKNNDPBMTSVNVNNSDC 
ITNEILWFTEALEFNTVVKLFAIiANTRADDHVAFAIAIMLXAN 
KT I TS LN LDSNH I TG KG I LAI FRAHjQNNTLTELRFHNQRHI CG 
GKTEl^IAKLLKElTrrLLKIiGYHFEI^G 

rqkrlqeqroaqbakgekkdllevpkagavakgspkpspqpspk 

PSPKNS PKKGGAPAAPPPPPPPLAP PL IMENLKNSLSPATQRKM 
GDKVLPAQEKNSRDQLLAAI rssnlkqlkkvbvpkllq 


5560 


9 


921 


ssnarefsalsvsmaclspsqlqkpqqdgflvleqplsaeecvam " 
qqrige ivaemdvplhcrtefstoeeeqlraqgstdyflssgdk 
i rp ffb kgvfdekgnfl vppeks inkighalhahdpvfks i ths 
fkvotlarslglqmpwvqs myifkqphfggevs phqdas plyt 
EPLGRVLGVWIAVEDATLENGCLWFI pgshtsgvsrrmvrapvg 
S APGTS FLGSEPARDNSL F VPTPVQRGALVLIHG EWHKSKQNL 
SDRSRQAYTFHLMEASGTTWSPENWLQPTAELPFPQLYT 


5561 


2175 


1775 


C YF I FQFFSSP Y PGLHPHQT PAPLPNPGLYPPPVS MS PGQP PPQ | 








qllaptypsapgvknpgnp8ypyapgalpppppphlypntqaps 

qvyggvtyynpaqqqvqpkpspprrtpqpvt:kppppewsrgs 
s 


5562 


342 


1385 


SSGKNDMAAAGAAGLVRGLKAGVLSQADYLNLVQC^EDLKXg 
LQS TDYGN FLANEAS PLTVS V I DDRLKEKMWE FRHMRNHAYEP 
LAS PIjD PITYS YMIDNVI LLITGTLHQRS I AELV P KCH PLGS FE 
QMEAVNIAQTPABLYNAIIiVDTPLAAFPQDCISEQDLDEMNIEI 
IRNTLYKAYI^SPYKFCTLLGGTTADAMCPILEFEADRRAFI IT 
INS FGTELS KEDRAKLPPHCGRLYPEGLAQLARADDYEQVKNVA 
DYYPEYKLLFEGAGSNPGDKTLEDRFFBHEVKlaNKLAFLNQFHF 
GVFYAFVKLKEOECRNIVWIAECIAQRHRAKIDNYIPIF 


5563 


342 


1385 


SSGKNDMAAAG^GLVRGhKAGVLSQADYLNLVQCETLEDLKLH 
LQSTDYGNFliANEASPLTVSVlDDRLKEKMVA^FRHMRNHAYEP 
LAS FLDFI TYS YMIDNVI LLI TGTLHQRSIABLV PKCI I PLGS FE 
QMEAVNIAQTPAELYNAILVDTPLAAFFQDCISEQDLDBMNIEI 
IRNTL YKAYLES FYKFCTLLGGTTADAM CP I LB FEADRRAF I IT 
INS FGTELSKEDRAlQjFPHCXaRLYPEGIiAQLAIUu^DYEQVKNVA 
DYYPE YKLLFEGAGSNPGD KTLEDR FF EHEVKLNKLAFLNQ FHF 
GVFYAFVKLKEQECRNIVWIAECIAQRHRAKIDNYIPIF 


5564 


3 


914 


ft VRRDKRAVWTARGRRR CGDSMSGG WMAQVGAWRTGALGLALLL 
lilXSLGLGLEAAASPLSTPTSAQAAGPSSGSCPPTKFQCRTSGLC 
VPLTWRCDRDLDCSDGSDEEECRIEPCTQKGQCPPPPGLPCPCT 
GVSDCSGGTDKKLRNCSRLACLAGELRCTLSDDCIPLTWRCDGH 
PIX2PDSSDEIjGCGTNEILPEGDATTMGPPVTLESVTSLRKATTM 
GPPVTLES VPS VGNATSSSAGDQSGS PTAYG V I AAAAVLS AS LV 
TATLLLLSWLRAQERLRPLGLLVAMKESLLLSEQKTSLP 


5565 


993 


138 


RWNSPNPARAGS I SRPQRAPGS VSAVAMTAAVPFGCAFIAFGPA 
LALYVFT I ATB PLRI IFLI AGAFFWL VSLLI SSLVWFMAR VI I D 
NKDGPTQKYLLI FGAFVSVYIQEMFRFAYYKLLKKASEGLKS IN 
PGETAPSMRLLAYVSGLGFGrMSaVFSFVNTLSDSLGPGTVGrH 
GDSPQFPLYSAFMTLVI ILLHVFWG I VFFDGCEKKKWGILL I VL 
LTHLLVS AQTF I S S YYG INLASAEIILVLMGTWAFLAAGGSCRS 
LKLCLLCQDKNFLLYNQRSR 


5566 


2043 


1232 


SHIQHHGRGAQAPVKMVS WM I SRAW&VFGMLYPAY YSYKAVKT 
KNVKE YVRWMMYWI VFALYTVI ETVADQTVAWFPLYYELKI AFV 
I WLLS P YTKGASLI YRKFLH PLLSS KEREI DDY I VQAKERGYB T 
MVNFGRQGliNLAATAAVTAAVKSQGAI TERLRSFSMHDLTTIQG 
DBPVGQRPYQPLPEAKKKSKPAPSESAGYGIPLKDGDBfCTDEEA 
EGPYSDNEMLTHKGPRRSQSMKSVKTTKGRKEVRYGSLKYKVKX 
RPQVYF 


5567 



1554 


233 


EFKlSGVSPDLANEDGLTAIJlQCCIDDFRfiMVQOLLEAGANINA ' 
CDS ECWTPLHAAATCGHLHLVELIjIASGANLIJWNTDGNMP YDL 
CDDEQTLDCLETAMADRG ITQDS I EAARAVPELRMLDDIRSRLQ 
AGADLHAPLDHGATLLHVAAANGFSEAAALLLEHRASliSAKDQD 
GWEPLHAAAYWGOVPLVELLVAHGADLNAKS LMDETPLDVCGDE 
EVRAKIiljELKHKHDALLRAOSRQRSIiLRRRTSSAGSRGKVVRRV 
SLTQRTDLYRKQHAQEAIVWQQPPPTSPEPPEDNDDRQTGAELR 
P P P PEEDNP EWRPHNGRVGGS P VRHLYS KRLDRSVS YQLS PLD 
STTPHTLVHDKAHHTLADLKRQRAAAKLQRPPPEGPESPETAEP 
GLPGDTVTPQPDCGFRAGGDPPLLKLTAPAVEAPVERRPCCLLM 


5568 


1731 


587 


AEDRQPAS RRGAGTTAAMAAS GPG CRS WCLCPEVPSATFFTALL " 
SLLVSGPRLFH/3QPLAPSGLTLKS EALRNWQVYRLVTYI FVYE 
NP ISLLCGAI I IWR FAGNFERTVGTVRHCFFTVIFAI FS AI I PL 
S FEAVSSLSXLGEVEDARGFTP VA^AMLGVTTVRSRMRRALVPG 
MVVPSVLVpWLLLGASWLIPO/rSFLSWCGLSIGLAYGLTYCYS 
IDI^ERVALKLDOTFPFSLMRRISVFKYVSGSSAERRAAQSRKL 
NPVPGS YPTQS CHPHLS PSHPVSQTQHASGQ KLASWP S CTPGHM 
PTI*PPYQPASGLCYVQNHFGPNPTSSSVYPASAGTSLGIQPPTP 
VNSPGTVYSGALGTPGAAGSKESSRVPMP 


5569 


2 


835 


QTPCPLAWERGSRSEDISVPGQKPPTCSSFSGMDVGPSSLPHLG " 








LKLLLLLLLLPLRGQANTGCYGIPGMPGLPGAPGKDGYDGLPGP 
KGEPGI PAIPGIRGPKGQKGBPGLPGHPGKNGPMG PPGMPGVPG 
PMG I PGEPGBEGR YKQKFQS VFTVTRQTHQPPAPNS LIRFNAVL 
TNPQGDYDTSTGKFTCKVPGLYYFVYHASHTANLCVLLYRSGVK 
WTFCGHTSKTNQVNSGGVLLRLQVGEEVMLAVNDyYDMVGZQG 
SDSVPSGFLIiFPD 


5570 


264 


946 


RDRRDRGGVATSTEEPARPRAPQSRGPGPVSQTGRGRERGGGDT 
MSSPSPGKRRMDTDVVKLlESKHEVTILGGLNEFVVKFyGPQGT 
PYEGGVWKVRVDLPDKYPFKSPSIGFMNKI FHPN IDEASGTVCL 
DVINQTWrALYDI*TNIFESFLPX3LLAYPNPIDPLKGDAAAMYriH 
RPEBYKQKIKBYIQKYATEEALKEQEEGTGDSSSESSMSDFSED 
EAQDMEL 


5571 


264 


946 


RORRDRGGVATSTEEPARPRAPQSRGPGPVSQTGRGRERGGGDr 
M3SPSPGKRRMDTDWKLIESKHBVTILGGLNEFWKFYGPQGT 
P YEG GVWKVRVDLPDKYP FKSPS I GFMNKI FHPNI DEASGTVCIj 
DVINQTWTALYDLTNIFESFLPQLIAYPNPIDPLNGDAAAMYLH 
RPEBYKQKIKBYIQKYATEEALKEQEEGTGDSSSESSMSDFSED 
EAQDMEL 


5572 


2802 


2085 


RTDYRTGI TORRFRVMAAGDGDVlCLGTLGSGSESSNDGGSESPG' " 
DAGAAAEGGGWAAAAIJUjLTGGGEMLIjNVALVALVLLGAYRLWV 
R WGRRGLGAGAGAGEES PATS hPRMKKRDFSLEQ LRQYDGS RNP 
RILLAVNGKVFDVTKGSKFYGPAGPYGIFAGRDASRGIATFCIjD 
KD AIiRDE Y DD LSDLNAVQME S VREW EMQ FKEXYD YVGRLJjK PG E 
EPSEYTDEEDTKDHNKQD 




2562 


219 


VPART PNAEDQGP EARAAT AT PCQSGd RERAGEAAEDG VkMAAF 

sbmgvmpeiaqaveemdwllptdiqaesiplilgggdvlmaaet 
gsgktgafsipviqivyetlkdqqegkxgkttiktgasvlnkwq 
mnpydrgsafaigsdglccq3revkewhgcratkglmkgkhyyb 
v5ckdqglcrvgwstmqasldlgtdkfgfgfggtgkkshnkqfd 
nygee ftmhdt i gcyldi dkghvkfs kngkdlgla fe i pphmkn 

QALFPACVLKNAELKFWFGEEEFKFPPKDGFVALSKAPDGYIVTC 
S3HSGNA0VTQTKFLPNAPKALIVEPSRELAEQTLNNIKQPKKY 
IDNPKLRELLI IGGVAARDQLS VLENGVDIWGTPGRLDDLVST 
GKL^SQVRFLVLDEAIX3LI*SQGYSDFrNRMHNQIPQVTSDGKR 
LQVI VCS ATLHS FDVKKLS E KIMHFPTWVDLKGEDSVPDTVHHV 
VVPVNPKTDRLWERLGKSHIRTDDVHAXDNTRPGANSPEMWSEA 
I KILKGEYAVRAI KEHKMDQA II FCRTK1DCDNLEQYFIQQGGG 
PDKKGHQFSCVCLHGDRKPHERKQJNLERFFCKGDVRFLICTDVAA 
RGID1HGVPYVINVTLPDBKQNYVHR1GRVGRAERMGLAI SLVA 
TBKEKVT^YHVCSSRGKGCYOTRIiKEDGGCriWYNEMOLLSEIEE 
HLNCTI SQVBPDIKVPVDEFDGKVTYGQKRAAGGGSYKGHVDI L 
APTVQELAALEKEAQTSFLHLGYLPNQLFRTF 


5574 


1731 


952 


NEGLBVFKBQBLQPEDKGAVPEDASTERSAMASLGLQLVGYIU3 
LLGL LG TL VAMLLP S WKTS S YVG AS IVTAVGFS KGLWME CATH S 
TGITQCDI YSTLLGIiPADI QAAQAMMVTSSAISSLACI IS WGM 
RCTVFCQESRAKDRVAVAGGVFFILGGLLGFIPVAWNLHGILRD 
FYSPLVPDSMKFEIGEALYLGI ISSLFSLIAGI ILCFSCSCQRN 
RSNYYDAYQAQPLATRSS PRPGQPPKVKSEFNS YSLTGYV 


5575 


456 


766 


LlaWALPCPPPTAAAVIiLSSTGLMELLEKMbALTliAXADS 

LCSAWLLTASFSAQQHKGSLQKbpLLSQACVGCLEALLDYLDAR 

SPDIGRNSPHYLMFP 


5576 


249 


2146 


RS WGAP W FWRMRLLRRRHMP LRLAMVGCAFVLFLFLLkRDVS 3 R 
EEATEKP WLKSL VS RKDHVLDLMLEAMNNLRDS MPKLQ I RAP E A 
QO^PSINQSCLPGPYTPAEIiKPFWERPPQDPNAPGADGKAFQK 
SKWTPLETQEKEEGYKKHCFNAFASDRISLQRSIiGPDTRPPECV 
DQKFRRCP PLATTS VI IVFHNEAWSTLL RTVYS VLHTTPAILLK 
EI ILVDDASTEEHLKEKLEQYVKQLQWRWRQEERKGLITARL 
LG ASVAQA2 VLTFLD AHCE C FHG WL EP LLAR I AED KTVWS P D I 
VTIDLNTFEFAKPVQRGRVHSRGNFDWSLTFGWETLPPHEKQRR 
KDETYP I KS PTFAGGLFS 1 S KS Y FEK2 GTYDNQME I WGGENVEM 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








SFRVWQCGGQLE1 1 PCSWGHVFRTKSPHTFPKGYSVIARNQVR 
LAEVWMDSYKKIFYRRNLQAAKMAQEKSFGDISERLQLREQLHC 
HNPSWYIJiNVYPEMFVPDLTPTFyQAJKNIX3TO0CLDVGBNNRG 
GKPLIMYSCHGLGGNQYFEYTTQRDLRHNIAKQLCLHVSKGALG 
LGSCHFTGKNSQVPKDEEWELAQDQIiIRirSGSGTCLTSQDKKPA 
MAPCNPSDPHQLWLFV 


5577 


3 


1275 


RNSDCSCGEISVHCLPWVLFILDLKVBSSMFCPLkLILLPVLljD 
YSL5LNDLNVSPPELTVHVGDSALMGCVFQSTBDKCI FKIDWTL 
SPGEHAKDEYVLYYYSNLSVPIGRFQNRVHLMGDILCNDGSLIjL 
QDVQEADQGTYlCBlRLKGESQVFKKAVVhHVhPEEPKSlMVKV 
GGLIQMGCVFO^TEVKHVTKVEWIFSGRRAKEEIVFRYYHIOiRM 

sveysqswghfqnrvnlvgdifrndgs imlqgvresdggnytcs 
zhlgnlvfkktivlhvs pbeprtlivtpaalrplvlggnqlvi iv 
givcatilllpvlilivkktcgwkssvnstvlvkntkktnpeik 
ekpchfbrcegekhiyspiivrevieeeepsekseatymtmkpv 

WPSLRSDRNNSLEKKSGGGMPKTQQAF 


5578


3 


783 


avesmaspgagrappelperncgyreveywdqryqgaadsapyd 
wfgdfsspraij^pelrpedrilvlgcgnsalsyelflggfpnv 

TS VDY3S VWAAMQARYAHVPQLRWETMDVRKLDFPSAS FDWL 
e kg TLDALLAGERDPWTVSSEGVHTVDQVLSE VSR VLVPGGRFI 
SMTS AAPHFRT RHYAQAYYGWSLRHAT YGSG FHFHLYLMHKGG K 
LS VAQLALGAQII»S PPRPPTS PCFLQDSDHEDPLSAIQL 


5579 


3 


1540 


RNSGLARG AS ALARHGGGLAGG VG WDCGACAS RCQGVMEGLLTR 
CRALPAIATCSRQLSGYVPCTraHCAPRRGRRJuXiLSRVFQPQNL 
REDRVLS LQDKS DDLTCKSQRLMLQVGLI YPAS PGCYHX/LPYTV 
RAMEKIjVRVIDQEMQAIGGQKVNMPSLiSPAEliWOATNRWDLMGK 
ELLRLRDRHG KS YCLG PTHEEAI TAL IAS QKKLS Y KQ L P FL L YQ 
VTRXFRDEPRPRFGLLRGRKFYMKDMYTFDSSPEAAQQTY3LVC 
DAYCSLFNKLGriPFVKVQADVGTlGGTVSHEFQLPVDIGEDRXA 
1CPRCSFSANMETUDLSQMNCPACQGPLTKTKGIEVGHTFYLGT 
KYS S I FNAQ FTNVCG KP TLAEttG CY GIjGVTRI LAAAI E VLS TED 
CVR WPS LLAP YQACLI PP JCKGS KEQAASEL I GQLYDHITEAVPQ 
LHGEVLLDDRTHLTIGNRLKDANKFGYPFVIIAGKRAIiEDPAHF 
EVWCQNTGBVAFLTKDGVMDLLTPVQTV 


5580 


1681 


450 


ADAGTRC IPGF WPSGAGYSAPAQRGRRSSGRMRAAAAPGLTAP 
WRLLO^CELEAGELGMAVPAAAMGPSALGQSGPGSMAPWCSVSS 
G PS RYVLX^QELFRGIISKTREFLAHSAKVHS VAWS CDGRRLASG 
SFDKTASVFI^KDRLVKENNYRGHGDSVDQLCVJHPSNPDLFVT 
ASGDKT IR I WDVRTTKCIAT VNTKGEN INI CWS PDGQTIAVGNK 
DD WTF I DAKTHRSKAEEQFKFEVNE I S WNN DNNM FFI*TNGKG C 
INILSYPELKPVQSINAHPSNCICIKFDPMG1CYFATGSADALVS 
LiWDVDELVC VR CFSRLDWPVRTLS FSHDGKMLASAS EDH FIPIA 
EVETGDKLWEVQCBSPTFTVAWHPKRPI*LAFACDDKDGKYt)S SR 
EAGTVKLFGLPNDS 


5581


54 


947 


GGGSGPRAPSATLLDTGESVAAVASGEDKGIAASAAAAAVFACS 
CS PDPQSSTMN PVYSPVQPGAP YGNP KKMAYTGYPTAYPAAAPA 
YNPSL YPTNS PS YAPEFQFLHSAYATLUvJKOAWPONSSS CGTEG 
TFHLP VDTGTENRT YQASS AAFRYTAGTP YKVP PTQSNTAP P P Y 
SPSPNP YO/TAM YPI RSA YPQQNL YAO^AYYTQPVYAAQPHVIHH 
TTWQPNS IPS AI Y PAPVAAPRTNG VAKGMVAGTTMAMSAQTL L 
TTPQHTAIGAHPVSMPTYRAQGTPAYSYVPPHW 


5582


Ettc 

9 / JO 


2739 


I ITNNNNVI I PLVIAYHLSGS AQARGERS PAERLMERQKRKAD I 
EKGLQFIQSTLPLKQEEYBAFLLKLVQNLFAEGNDIjFREKDYKQ 
ALVQYMEGl^ADYAASDQVALPREIiLCKLHVNRAACYFTMGLY 

EKALEDS ekalgldses iralfrkaralnelgrhkeayecssrc 

SLALPHDESVTQLGQELAQKLGLRVRKAYKRPQELETFSLLSNG 
TAAGVADQGTSNGLGSIDDIETDCYVDPRGSPALLPSTPTMPLF 
PHVLDLLAPIiDS SRTLPSTDSLDDFSDGDVFGPEIJDTLIiDSIiS L 
VQGGLSGSGVPSBIiPQLIPVFPGGTPLLPPWGGSIPVSSPLPP 
ASFGLVMDPSKKLAASVLDALDPPGPTliDPLDLLPYSETRLDAL 
DSFGSTRGSLDKPDSFMBBTNSQDHRPPSGAQKPAPSPEPCMPN " 
TALLIKNPLAATHEFKQACQLCYPKTGPRAGDYTYREGLEHKCK 
RDI LI/3RLRSS EDQTW KR IRPR PTKTS FVGS YYLCKDM INKQDC 
KYGDNCT FAYHQEE I D VW TEB R KGTLNRDLL FDPLGG VKRG S LT 
IAKLLKE HQG I FTFLCE I CFDS KPRI IS KGTKDS PS VCSNLAAK 
HS F YNNKCLVH I VRS TSLKYS KIRQFQBHFQFDVCRHEVR YGCL 
REDS CHFAHS F I ELKVWLLQQY SGMTHEDIVQESKKYWQQMEAH 
AGKASSSMGAPRTHGPSTFDLQMKFVCGQCWRNGQVVEPDKDLK 
YCSAKARHCWTKERRVLLVMSKAKRKWVSVRPLPSIRNFPQQYD 
LCI HAQNGRKCQ YVGNCS FAHS PE ERDMWTFMKENKI LDMQQT Y 
DMWL KKHN PGKPGEGTP I S SREGE KQIQMPTDY AD I MMGYHCW L 
CGKNSNSKKQWQQHIQSE3CHKEKVFTSDSDASGWAFRFPMGEFR 
LCDRLQKGKAC?IX3DKCRCAHGQEE1iNEWIjDRREVLKQKLAKAR 
KDMLLCPRDDDFGKYNFLLQEDGDLAGATPEAPAAAATATTGE 


5583


3 


1265 


SSGCRQGRPGRSDRPRPPPRRHKMVKETRYYDILGVKPSASPEB" 
I KKA YRKLALKYHPDKNPDEGE KFKLISQAYEVLSDPKKRDVYD 
QGGEQAIKEGGSGSPSFSSPMDIFDMFFGGGGRMARBRRGKNW 
HQLSVTLEDLYNGVTKKLALQKNVI CEKCEGVGGKKGSVEKCPL 
CKGRGMH1HIQQIGPGMVQQIQTVCIECKGQGERINPKDRCESC 
SGAKVI REKKI I E VH VEKGMKDGQKILFHGEGDQEPELEPGDVI 
IVIiDQKDHSVFQRRGHDIiIMKMKIQLSEALCGFKtCTI KTLDNRI 
LVITS KAGEVIKHGDLRCVRDEGMP I YKAPLEKGILI IQFLVIF 
PEKHWLSLEKLPQLEALLPPRQKVRITDDMDQVELKEFCPNEQN 
WRQHREAYEEDEDGPQAGVQCQTA 


5584 


3 


1265


SSGCRQGRPGRSDRPRPPPRRHKMVKETRYYDILGVKPSASPEE"" 
I KKAYRKLALKYHPDKNPDEGE KFKLISQAYEVLSDPKKRDVYD 
QGGEQAI KEGGS GS PSFSS PMD I FDM FFGGGGRMARERRGKNW 
HQLSVTLEDLYNGVTKKLALQKNVI CEKCEGVGGXKGSVEKCPL 
CKGRGMHIHIQQ I GPGMVQQ I QTVC IECKGQGER I NPKDRCESC 
SGAKVI REKKI I EVHVEKGMXDGQK I L FHGEGDQE PELE PGDVI 
IVLDQKDHSVFQRRGKDL I MKMKIQLSEALCGFKKTI KTLDNRI 
LVITSKAGEVI KHGDLRCVRDEOIPI YKAPLEKGILI IQFLVTF 
PEKHWLSLEKLPQLEALLPPRQKVRITDDMDQVELKEFCPNEQN 
WRQHREAYEEDEDG PQAG VQCQTA 


5585


2619 


915 


LPAGTPESSIjHEALDQCMTALDLFLTNQFSEALSYLKPRTKESM " 
YHSLTY ATILEMQAMP1TTDPQD ILLAGNM^EACfMLCQRHRRKS 
SVTDSFSSLVNRPriiGQFTEEEIHAEVCYAKCLLQRAALTFLQD 
ENMVSF I KGG I KVRNS YQTYKELDSLVQSSQYCKG3NHPH FEGG 
VKIX3VGAFNLTLSMLPTRIIJ^LEFVGFSGNKDYGLLQLEEGAS 
GHSFRSVLCVMLLLCYHTFLTFVIiGTGlTVNIEBAEKLLKPYLNR 
YPKGAI FLFIiAGRI E VI KGtf IDAAIRRFEECCEAQQHWKQFHHM 
CYWELMWCFTYKGQWKMSYFYADliLSKENCWSKA?YlYMKAAYL 
SMFGKEDHKPFGDDEVELFRAVPGLKLKIAGKSLPTEKFAIRKS 
RRYFS SNP ISLP VPALEMMYI WNGYAV I GKQPKLTDGILB 1 1 TIC 
AEEKLEKGPENEYSVDDECLVKLLKGLCLKYLGRVQEAEENFRS 
I SANE KKI KYDHYL I PNALLELALLLMEQDRNEEAI KLLESAKQ 
NYKNYSMESRTHFRIQAATLQAKSSLBNSSRSMVSSVSL 


5586


2619 


915 


LPAGTPESSI^EALDQCOTALDLFLTNQFSEALSYLKPRTKESM 
YHSLT YAT I LEWQAMMTTOPQDI tiLAGNMMKEAQMLCQRHRRKS 
SVTDS FS S L VNRPTLGQ FTEEE IHAE VCYAKCLLQRAALTFIiQD 
ENMVS FI KGG I KVRN S YQTYKELDS LVQS SQ YCKGENHPHFEGG 
VKLGVGAFNLTLSML PTRILRLLEFVG FS GNKD YGLLQLEEG AS 

f*TJOOQ PITT /~ n rxAT T T OvrimoT I ■> cm #t rtm/^nn ntY ntry t trrs^rr %.ttn 

bHb r RS V JjCVWLiLi LC YHTFLTFVLGTaSNVN I EEAE KLLKP YLNR 
YPKGAI FLFLAGR IBVI KGNIDAAIRRFEECTEAQQHWKQFHHM 
CYWELMMCFTYKGQWKMSYFYADLLSKENCWSKATYIYMKAAYL 
SMFGKEDHKPFGDDEVELFRAVPGLKLKIAGKSLPTEKFAIRKS 
RRYFSSNP ISLPVPALEMMYI WNGYAVIGKQPKLTDGILEI ITK 
AEEMLEKGPENEYSVDDECLVKLLKGLCLKYLGRVQEAEENFRS 
I S ANEKK IKYDHYLI PNALLELALLLMBQDRNEEAIKLLESAKQ 
NYICNYSMESRTHFRIQAATLQAKSSLENSSRSMVSSVSL 


5587 


1768


148 


SSAVPDGAVGRPVAVAVGGP PHS CRCRPCCLMAAIGVHLG CTSA 
CVA VY K DG RAG WAN DAG DRVTPA WAYS ENE E I VGLAAXQ 
RNISNTVMKVKQILGRSSSDPQAQKYIAESKCI.VIBKNGKLRYE 
IDTGBETKFVKPEDVARLI PS KMKETAHS VLGSDAUD WITVPP 
DFGBKQKNALGEAAJIAAG FNVLRX IHBPSAALIiAYGIGQDS PTG 
KSNILVFKLGGTSIiSLSVMEVNSGIYRA^LSTNTDDNIGGAHFTB 
TLAQ YLAS EFQRS PKHDVRGNARAMMKLTNSAEVAKHSLSTLGS 
ANCFLD S L YEGQD FDCNVS RARFELLCS PLFNKCI EAI RGLLDQ 
NGFTADDINKWLCGGSSRIPKLQQLIKDLFPAVELLNS I P PDE 
VIPIGAAIEAGILIGKENLLVEDSLMIECSARDIIiVKGVDESGA 
SRFTVLFPSGTPLPARRQHTLQAPGSISSVCLELYESDGKNSAK 
E BTKPAQVVLQDLDKKENGLRDI LAVLTT4KRDG SLHVTCTDQET 
GKCBAISIEIAS 


5588


3 


509 


TPPPP EQAM VAATVAAAW L LLWAAACAQ QKQD FYD ?KAVN IRGK 
LVSLEKYRGSVSLVVWASECGFTDQHYRALQQLQRDtiGPHHFN 
VLAFPCNQFGQQEPDSNKEIESFARRTYSVSFPMFSKIAVTGTG 
AHPAFKYLAQTSGKEPTWNFWKYLVAPDGKWGAWDPTVSVEEV 
RPQ I TAL VRKL I LLKREDL 


5589 


1884


553 


LRQAWHEGGIGQTDKERGAAAIiPGEEGDPTRGRSLGRASWESGS 
PRRPRSP FSSFLPRPI CLSLEARPCS 3 EDRRNWS IjIGRPGAPAS 
GLNRS SGLWLG P DR CRPRS RCS CRVMENPS PAAALGKAL CALLL 
ATU3AAGQPLGGE S I CS ARAPAKYS I TFTGKWSQTAFPKQY PL F 
RPPAQWSSLljGAAHSSDYSmRKNQYVSNGLRDFAERGEAVJALM 
KZ I EAAGEALQS VHAVFSAP AVPS GTGQTS AELEVQRRKS LVS F 
VVRIVPSPDWFVGVDSItDLCDGDRWREQAALDLYPYDAGTDSG? 
TFSSPNFATIPODTVTEITSSSPSHPANSFYYPRLKALPPIARV 
TLLRLRQS PRAF 1 P PAP VLPSRDNETVDS AS V PETPLDCEVSLW 
SSWGIiOGGHCGRLGTKSRTRYVRVQPANNGSPCPEIjEEEABCVP 
DNCV 


5590 


72 


896 


LCSSGAJbRL LPAMVAWRS AFLVCIiAFS LATLVQRGSGDFDD flNL " 

EBAVKETS SVKQPWDHTTTTTTNRPGTTRAPAJCPPGSGLDLADA 

LDDQDTX3RRKPG I GGRERWNHVTTTTXRPVTTRAPANTLGND FD 

LADALDDRNDREDGRRKPIAGGGGFSDKDLEDrVGGGEYKPDKG 

KGDGRYGSNDDPGSGMVAEPGTIAGVASALAMALIGAVSSYISY 

(MKKFCFSI0^3GLNADYVKGENIjEAVVCEEPQVKYSTLHTQSAE 

PPPPPEPARI 


5591 


68 


1494 


AGSSRKAAAERIJjVSAGCRSIiAGRASGVLLLPAELli^GEEEAMA ' 
LRVTRNS KINAENKAK I NMAG AKRVPTA P AATSKPGLR PRTALG 
DIGNKVSEQLQAKMPMKKEAKPSATGKVIDKKXiPKPLEKVPMLV 
PVPVSEPVPEPEPBPEPEPVKEEKLSPEPILVDTASPSPMETSG 
CAPAEBDLCQAFSDVXLAVNDVVAEDQADPNLC3BYVKDX YAYL 
RQLEEEQAVRPKYLI/3REVTGNMRAILIDVfLVQVQMKFRLLQET 
M YMTVS I IDRFMQNNCV P KKMLQLVG VTAMFIAS KYEEMYPP E Z 
GDFAFVTDNTYTKHQI RQMEMKI LRALNFGLGRPLPLH FLRRAS 
KIGEVDVEQHTLAKYLMELTMLDYDMVHFPPSQIAAGAFCLALK 
ILDNGEWTPTLQHYLS YTEESLLPVMQHLAKNAAMVNQGLTKHM 
TVKNKYATS KHAKI ST LPQLNS ALVQDLAKAVAKV 


5592 


242 


924 


YGES KDWNQ KD IjIjS AL VLTTVN Cl»PT P I MAJCS AEVKLAI FGRAG ' 
VGKSALWRFLTKRFI WE YDPTLES TYRHQATIDDEWSMEILD 
TAGQEDTIQREGHMRWGEGFVLVYDITDRGSFEEVLPLKNILDE 
IKKP KrWTLILVGNKADLDHSRQVSTEEGEKLATELACAFYECS 
ACTGEGNITEI FYELCRE VRRRRMVQGKTRRRS STTHVKQAINK 
mux ubo 


5593 


3 


1113 


HA5GGRAANMAAERGAGQQQSQEMMEVDRRVESEESGDEEGKKH " 

SSGIVADIiSEQSLKDGEERGEEDPEEEHELPVDMETINLDRDAE 

DVDLNHYRIGKIEGFEVLKKVKTLCLRQNIjIKCIENLEELQSLR 

8ldl ydnqi )gci enlealtele ildi s fnllrniegvd kltrlk 
klflvnnkiskienlsnlhqlomlelgsnriraienidtltnle 
slflg knkitklo^ldaitnltvlsmqsnrltkxeclqnlvnlr 
elylshng i evibgleknnkltmiidiasnri kkieni shltelq 
e pwmndnlle s ws dldeiikg arsletvylern plqkdpqyrrkv 
MLALPSVRQIDATFVRF 


5594 


3 


1113 


HASGGRAANMAAERGMO^QSQEMMEVDRRVESEESGDEEGKKH 
SSGIVADL5EQSLKDGEERGEEDPEESHBLPVDMETINLDRDAB 
DVDLNHYRIGKIEGFEVLKKVKTLCLHQNLI KCIENLBELQSLR 
BLDLYDNQI KKI ENIiBALTELE I IiDI SPNLLRN IEGVDKLTRIjK 
IQjFLVNNKlSKIEt^SNLHQJjQMLELGSNRIRAISNIDTLTNLE 
SLFLGKNKITKLQNLDALTNLTVLSMQSNRLTKI EGLQNLVNLR 
ELYLSHNGIEVIEGLENNNKLTMLDIASNRIKKI2NISHLTELQ 
EFWMNDNLLBSWSDLDELKGARSLETVYLERNPLQKDPQYRRKV 
MLALPSVRQIDATFVRF 


5595 


3 


1476 


ARWNGRWVQVPAWPGPGCGTNASGER.QRQLPRAWRPVGRTLGSE 
PIAIAWSPPLYLPPIPLPSWAVSQPTPTLGTMFADLDYDIEEDK 
LGIPTVPGKVTLQKDAQNLIGISIGGGAQYCPCLYIVQVPDWTP 
AALDGTVAAGDE I TGVNGRS I KGKTKVEVAKMI QEVKGE VT I HY 
NKLQADPKQGMS LDIVLKRVKHRLVENMSSGTADALGLS RAI LC 
NIX3LVKRLEELERTAELYKG>TrEHTKNLLRAPYEI^QTHRAPGD 
VPS VIGVREPQPAASEAFVXFADAHRS IEKFG I RLLKT I KPMLT 
DLNT YLNKAI PDTRIiTI KKYLDVKPE YLS YCLKVKEMDD EE YS C 
I ALGEP L YRVS TGNYEYRL I LRCRQEARARFSQMR KDVLE KMSL 
LDQKHVQDXVFOLQRLVSTMS KYYNDCYAVLRDAD VFP I EV DIjA 
HTTIAYGLNQEEFTDGBEEEEEEDTAAGEPSRDTRGAAGPLDKD 
GSWCDS 


5596


698


219 


GAVLAPSSLPAAELAAQGES QSLEDLSNTSRPTSEVYKISFI FP 

ngdkydgdctr'tssgiyerngiqihttpngivytgswkddkmng 
fgrleh psgavye gqfkdnmfhg lgtytfpng akytgnfnenrv 
kgegeythiqgtrmdvvtfhftscsq? 


5597 


3 


731 


isckmaadgqsslpaswrsvtlthveypagdlsghllaylslsp 
vfvivgfvtliifkreiihtisfiiggijujnegvnwliknviqepr 
pgggphtavgtkygmpsshsqfmwffsvysflflylrmhqtnna 
rfij)llwrh\^lglijwafi*vsysrvyllyhtwsqvlyggiag 
glmaiawfiftqevltplfpriaawpvsefflirdtslipnvlw 

FEYTVTRAEAKNRQRKLGTKIiQ 


5598


326


2440


GIGPIAASFIFCKVASLYIFLSPPPPSVSGVPYSPANSSWSCAI, 
VPLLGSGVPPHPPAPS PCCSGQTKLKMLSFKLLIiLAVALGFFEG 
DAKFGERNEGSGARRJRRCLNGNPPKRLKRRDRRMMSQLELLSGG 
EMLCGGFYPRLS CCIiRSDS PGLGRLENKI FSVTNNTECGKLLEB 
1 KCALCS PHSQ S LFHS PBREVLERDLVLPLLCKDYC KEFFYTCR 
GH1 PGFLO^TADEFCFYYARKDGGIjCFPDFPR KQVRGPASNYLD 
QMEEYDKVEE1SRKHKHNCFCIQEVVSGLRQPVGALHSGDGSQR 
LFILBKEGYVKILTPEGEI FKEPYLDIHKLVQSGIXGGDERGLL 
SLAFHPNYKKNGKLYVSYTTNQERWAIGPHDHILRVVEYTVSRK 
NPHQVDLRTARVFLEVAELHRKHLGGQLLFGPDGFLYI I LGDGM 
ITLDDMEEMDGLSDFTGSVLRLDVDTDM CNVPYS I PRSNPHFNS 
TNQPPEVFAHGLHDPGRGAVDRHPTDININLTILCSDSNGKNRS 
SARILQIlKGKDYBSEPSLLEFKPFSNGPLVGGFVYRGCQSBRIi 
YGSYVFGDRNGNFLTLQQSPVTKQWQBKPLCLGTSGSCRGYFSG 
HILGFGEDELGEVYrLSSSKSMTQTHNGKLYKIVDPKRPLMPEE 
CRATVQ P AQTLTS ECS RLCRWG YCTPTG KCCCS PGWEGD FCRTG 


5599 


326 


2440 



GIGPIAASFIFCKVASLYIFLSPPPPSVSGVPYSPANSSWSCAi 

VPLLGSGVP phppapspccsgqtmlkmls FKLLLLAVALGFFEG 
DAKFGERNEGSGARRRRCLKGNP PKRLKRRDRRMMSQLELLSGG 
EMLCGGFYPRLSCCLRSDSPGIiGRLENKI FSVTNNTECGKLLEE 
I KCALCS PHS QS L FHS PERB VLERDLVLPLLCKDYCKEFF YTCR 
GHIPGFLQTTADEFCFYYARKDGGLCFPDFPRKQVRGPASNYLD 
QMEEYDKVEEXSRKHKHNCFCIQEVVSGLRQPVGALHSGDGSQR 
L FILEKBGYVfdLTPEGEI FKEP YLDIHKL VQSGI KGGDERGLI* 
SLAFHPN YKKNGKLYVS YTTNQERWAIGPHDH I LRWE YT VSRK 
NPHQVDLRTARVFLEVAELHRKHLGGQLLFGPDGFLYI ILGDGM 
I TLDDMEEMDGLSDFTGS VLRLD VDTDMCNVP YS IPRSMPHFNS 
TNQPPEVFAHGLHDPGRCAVDRHPTDININIiTILCSDSNGKNRS 
SARILQIIKGKDYESEPSLLEPKPPSNGPLVGGFVYRGCQSERI, 
YGSYVFGDRNGNFLTLQQSPVTKQWQEKPLCLGTSGSCRGYFSG 
HILGFGEDEI^EWILSSSKSMTQTHNGKLYKIVDPKRPLMPEE 
CRATVQPAQTLTSECSRIjCRNGYCTPTGKCCCSPGWEGDFCRTG 


5600 


1977 


1244 


SLRVLSGHLMQTRDIiVQPDKPASPKFIVTLDGVPSPPGYMSDQE 
EDMCFEGMKPVNQTAASNKGLRGIiLHPQQLHLLSRQLEDPNGSF 
SNAEMSELSVAQKPEKLLERCKYV7PACKNGDECAYHHPISPCKA 
FPNCKFAEKCLFVHPNCKYDAKCTKPDCPFTHVSRRIPVLSPKP 
AVAPPAPPSSSQLCRYFPACKKMECPFYHPKHCRFNTQCTRPDC 
TFYHPTINVPPRHALKWIRPQTSE


5601


1977


1244 


SLR\TLSGHljMQTRDLVQPDKPAS PKFI VTLDG VPS P PGYMSDOE 
EDMCFBGMKPVNQTAASNKGLRGLLHPQQLHLLSRQLEDPNGSF 
SNABMSELSVAQKP EKLLERCKYW PACKNGDECAYHHP IS PCKA 
FPNCKFAE KCL PVHPNCKYDAKCTKPDCPFTHVSRRI P VLS P KP 
AVAPPAPPSSSQLCRYFPACKKMBCPFYHPKHCRFNTQCTRPDC 
TFYHPTINVPPRHALKWIRPQTSB 


5602


246 


766 


YHTS CTIVWRTAKEALENTEVPVGCIiMVYlWEVVGKGRNEVNQTK 
^TRHAEMVAIDQVLDWCRQSGKSPSEVFEH-rVLYVTVEPCIMC 
AAALRl^MKIPLVVYGCQNERFGGCGSVIiNXASADLPNTGRPFQC 
rPGYRAEEAVEMLKTFYKQENPNAPKSKVRKKECQQILNMF 


5603 


1 


565 


FRGRT P I SGGERGCAQ Y P I PATP ARSGENRTM PGAGDGGKAPAR 
WLGTGLLGLFLLPVTLSLEVS VGKATD I YAVNGTE I LLPCTFS S 
CFGFBDLHFRWTYNSSDAFKILIEGTVKNEKSDPKVTLKDDDRI 
TLVGSTKEKR^ISIVLRDLEFSDTGKYTOIVKNPICENNLQHHA 
TI FLQWDRRMQ 


5604 


1 


1506 


BDIFPAQLLKLQRKERVWQQE ppvrdhrswggsgaggVagrewt 
DQGQVAI/SGHYMAEGEGYFAMSEDEIiACSPYIPIjGGDFGGGDFG 

rlixsilsetipihgrgnfptlei^pslivkvvrrriaekrigvr 

DVRLNGSAASHVLHQDSGLGYKDLDL I FCADIiRGEGEFQTVKDV 
VliDCnjLDFLPEXSWKEKITPLTLKEAYVQKMVKVCNDSDRWSLI 
SLSNNS G KNVEL KFVDS LRRQFE FSVDSFQI KLDS LLL F YECS E 
NPMTETFHPTIIGESVYGDFQEAFDHLCNKI IATRKPEEIRGGG 
LLKYCNLLVRGFR PASDE I KTLQRYMCSRFFIDFSDIGEQQRKL 
ES YLQNH FVG LEDRKYE YI*MTLHGWNES TVCLMGH E RRQTLNL 
ITMLA IRVLAOQNVI PNVANVTCYYQPAPYVADANFSNYYIAQV 
QPVFTCQQQTYSTWLPCN 


5605


35


1821


SQRS CPRS PS BPAP PWARCS NPDS RTGGVP VPRAWSAGGP ALGIi 
MAAP VR LGR KRPIj PACPNPL F VRWLTEWRDEATRSRHRTR FVFQ 
KALRSLRRYPLPIiRSGKEAKILQHFGJDGIiCRMLDERLQRHRTSG 
GDHAPDSPSGENSPAPQGRIiAEVQDSSMPVPAQPKAGGSGSYWP ' 
ARHSGARVIliLVLYREHLNPNGHHFLTKEELLQRCAQKSPRVAP 
GSARPWP ALRS LLHRNLVLRTHQ PARYSLTPEGLELAQ KLAESE 
GLSLLNVG IGPKE PPGEETAVPGAASAEIiASEAGVQQQPLELRP 
GEYRVLLC^IGETRGGGHRPEIjLREliQRLHVTHTVRKIjHVGI)F 
VV^AQETNPRDPANPGELVLOHI VERKRLDDLCSS 1 1 DGRFREQ 
KFRL KRCG L E RR VY LVE EHGS VHNLS LPE S TLLQ A VTNTQ VI DG 
FFVKRTADI KESAAYLALLTRGLQRLYQGHTLRSRPWGTPGNPE 
SG AMTSPN P LCS LLT FS D FNAGA I KNKAQS VREVFARQLMQ VRG 
VSGE KAAALVDR YSTPAS LLAAYDACATP KE QETLLST I KCGRL 
QRNLG PALS RTLSQL YCS YGPLT 


5606 


3 


1099 


GRSRCPGPGARGGTMS PRS CLRS IiRL LVFAVFS AAASNWLYIiAK 
LSSVGS ISEEETCEKIjKGLIQRQVQMCKRNLEVMDSVRRGAQIiA 
IEECQYQFRNRRWNCSTLDSLPVFGKWTQGTREAAFVYAISSA 
GVAFAVTRACSSGELEKCGCDRTVHGVSPQGFQWSGCSDNIAYG 
VAFSQSFVDVRERSKGASSSRALMNLHNNEAGRKAILTHMRVEC 
KCHG VS G S CSV KT CW RAVP PFRQ VGHALKEKFDGAT EVE P RRVG 
SS RALV? RNAQ FKPHTDEDLVYLE PS PDFCEQDMRSGVLGTRGR 
TCNKTS KAIDGCELLCCGRGFHTAQVE LAER CS CKFHWCCFVKC 
RQCQRIiVELHT CR 


5607 


521 


141 


PPVCNPAEAMPSPGTVCSI^U4MT.WLDLAMAGSSFLSPEHQRV 
QQR KESKKP PA FaQ PRALAGWLRP EDGGQAEGAEDELEVRPNAP 
r u vijjxrojo v y x yyrlo yAiXs KF XjQD 1 LWEE AKEAPADK 


5608 


2 


983 


WFQSPLRQADPGPPRHTLFMDPVAGAIGGVCGDAVGYPLDTVKV 
RIQTEPKYTGIWHCVRDTYHRBRVWGFYRGIiLIjPVCTVSLVSSE 
VFGTYRHCLAHICRLRFGNPDAKPTKAD3TLSGCASGLVRVFLT 
SPTEVAKVRLQTQTQAQKQQRRLSASGPIAVPPMCPVPPACPEP 
KYRGPLHCLATVAREEGLCGLYKGSSALVLRDGHSFATYFLSYA 
VLCE WLSPAGHS RPD VPGVLVAGGCAGVLAWAVATPMDV1 KS RL 
QADGQGQRRYRGLLHCMVTIVREEGPRVLPKGLVLNCCZRAFPVN 
MWFVAYEAVLRLARGLLT 


5609


1628 


304 


AKGVWVLPSPPPRPGRGALVSGSGLRRGRSGTSWRPRRMNHKSK 
KRIREAKRSARPELKDSLDWTRHNYYESFSLSPAAVADNVERAD 
ALQLSVEEFVERYERPYKPVVLLNAQEGWSAQEKWTLERLKRKY 
RNQKFKCGEDNDGYSVXMKMKYYIEYMESTRIIDSPLY1FDSSYG 
EHPKRRKLLBDYKVPKFFTDDLFQYAGEKRRPPYRWFVMGPPRS 
GTGIHIDPLGTSAWNALVQGHKRWCLPPTSTPRELIKVTRDEGG 
NQQDEAITWFNV I YPRTQLPTW P PEPKPLE I LQKPGE TVFVPGG 
WWII WLNLDTTIAI TQNFAS STNFP WWHKT VRGR PKLS RKWYR 
ILKQEHPELAVIADSVDIiQESTGIASDSSSDSSSSSSSSSSDSD 
SB CE5 GSEGDGTVHRRKKRRTCS MVGNGDTTSOJDDCVS K ER5 SS 
R 


5610 




1196 


LE R TPASADMAW TKYQL FLAGLMLVTG £ INTLS AKWADN FMAEG 
CGGS KEHS FQHP FLQA VGM FLGE FS CLAAFYLLRCRAAG Q S D SS 
VDPQQ P FNPLLFLPPALCDMTGTS LMYVALNMTSASS FQMLRG A 
VI I FTG LPS VAF LG RRL VLS Q WLG I LAT I AGL VWGLADLLS KH 

DSQHIQjSEVITGDLLI imaqiivaiqmvleekfvykhnvhplra 
vgtegl fgfvi lslll vpm yyi pags f sgnprgtledaldafcq 

VGQQPLIAVALLGNISSIAFFNFAGISVTKELSATTRMVLDSLR 
TVVIWAliSLALGWEAFHALQILGFIiILLIGTALYNGLHRPLLGR 
LSRGRPLAEES EQERLLGGTRTP I NDAS 


5611 


2 


577 


FVL PNRLGI PGS TFRGPGACASSSSLAAS AKPGAGGS PALANSG 
ELSNRFQGGKAFGLLKARQERRLABINREPLCDQKYSDEENLPE 
KLTAFKEKYMEFDLNireGEIDLMSLKRJWEKL 
ISEVTGGVSDTI SYRDFVNMMLGKRSAVIjKLVMMFEGKANESSP 
KPVGPPPERDIASLP 


5612 


1 


721 


ASRDGYMDATIAPHRIPPEMPQYGEENri I FELMQAMWIiCKHLNS 

slltlenlilnefsytatearrlylqrktvpsallvqliqerla 
eedcikqgwildgipetrbqalriqtlgitprhvivlsapdtvl 
iernlgkridpqtgeiyhttfdwppbseiqnrlmvpediselet 
aqklleyhrni vrvi psypki lkvisadqpcvdvfyqaltyvqs 
nhrtnapftprvlllgpvgs


5613


115 


1 0*1 Q 


RGVDPALRRAEKMIjPIiS I KDDEYKPP KFNL FGKISG W FRS I LSD 
KTSFl^ FFFLCLNLS FAF VELL YX3IWSNCLGLISDS FHMF FDST 
AILAGLAASV1 SKWRDNDAFS YG YVRAEVLAGFVNGLFLI PTAF 
FI FSEGVERALAPPDVHHERLLLVSILGFWNLIG I FVFKHGGH 
GHSHGSGHGHSHSLFNGALDQAHGHVDHCHSHEVKHGAAHSHDH 
AHGHGHFlISHMPSLKETTGPSRQILQGVFLHIIiADTLGSIGVI 
ASAIKMQNTFGLM1ADPICSILIAILIWSVI PLLRE SVG ILMQR 
TP PLLBNS LPQCYQ RVQQLQGV YS LQEQHFWTLCSDVYVGTLKL 
I VAPDADARWILSQTHNI FTQAG VRQLYVQ I DFAAM 


5614 


3 


1268 


LLSRNEHACPLQAGLGIiTQRKPKAIRGREGRATNQGQGETQNER 
APWGARQRLGVMAELQQLQEFE I PTGRBALRGNHSALLRVADYC 
EDNYVQATDKRKALE ETMAFTTQALAS VAYQ VGNLAGHTLRMIiD 
LQGAAXRQVEARVSTI^QMVNMHMEKVARRE IGTLATVQRLPPG 
QKVIAPENL PPLTP YCRRPLNFGCLDD I GHGI KDLSTQLSRTGT 
IiSRKSIKAPATPASATLGRPPRIPEPVHLPWPDGRLSAASSAS 
SLASAGSAEGVGGAPTPKGOAAPPAPPLPSSLDPPPPPAAVEVF 
QRPPTLEELSPPPPDEELPLPLDLPPPPPLDGDELGLPPPPPGF 
GPDEPSWVPAS YliEKWTLYPYTSQKDNELS FSEGTVT CVTRRY 


5615 


9 


1558 


SDGWCEGVStihXSTGFFPGNYVBPSC 

AIiGRRR P<5DPREMEAAATPAAAGAAHREELDMDVh4RPL INEQNF 
DGTSDE BHEQELLPVQKHYQLDDQEG2 S FVQTLMHLLKG27IGTG 
LLGLPIAIKNAGIVLGPISLVPIGIISVHCMHILVRCSHFLCLR 
FKKSTLGYSDTVSPAMEVSPWSCLQKQAAWGRSVVDFFLVITQL 
GFCSVYIVFLAENVKQVHEGFLESKVFISNSTNSSNPCERRSVD 
LR I YMLCFL PFI I LLVFIRBL KNL FVLS FLANVSMAVS LVI I YQ 
YWRNK P D PHNLP I VAG WKKYPLF FGTAVFAFEGI G VVLPLENQ 
MKESKRFPQAI^IGMGIVTTLxVTLATLGYMCFHDEIKGSITLN 
LPQDVWLYQSVKILYSFGI PVTYS IQFYVPAE 1 1 IPGI TSKFHT 
KWKQICEFGIRSFLVSITCAGAILIPRLDIVISFVGAVSSSTIA 
LIL?PLVEILTFSKEHYN1WMVLKNISIAFTGWGPLLGTYITV 
BE 1 1 YPTP KWAGTPQS P FLNIiNSTCLTSGLK 


5616 


1 


719


DDFVRCGPOSAAMGASARLLRAVIMGAPGSGKGTVSSRITTHFE' 
LKHLS5^DLLRD^J^uURGTEIGVIAKAPID0^KLIPDDV^^ ,, RIiAL 
HEIiKNLTQYSWLLDG FPRTLPQAEALDRAYQ I DT VTNLNVPFEV 
IKQRLTARWrHPASGRVYNIEFNPPKTVGIDDLTGEPLIQREDD 
KPETVI KRLKAYEDQTXP VLEYYQKKGVLETFSGTETN KI WP Y V 
YAFLQTKVPQRSQKASVTP 


5617


176 


765


P WRGRGS RPRG AGAMAEEQ VNRSAG1APDCEASATAETT VS SVG 
TCEAAG KS PE PXDYDS TCVFCR I AGRQD PGTELLHCENEDLI CF 
KD I KPAATHHYL W P KKH I GNCRTLR KDQVE LVENMVT VG KT I L 
ERNNFTDFTNVRMGFHMPP PCS I SHLHLHV1APVDQLGFLS KLV 
YRVNSYWFI TADHLI EKLRT


5618


3 


1692 


YLNYINLKSENKLSGKEDLWBKLQYLWKSTLNLPEDLLRVPDES 
LFLNSGGDSLKS IRLLSBI EKLVGTS VPGLLE I ILSSS ILEIYN 
HILQTWPDE DVTFR KSCATKR KLSNINQE BAS GTSLHQKA I MT 
FTCHNEINAFWLSRGSQILSLNSTRFLTKLGHCSSACPSDSVS 
QTNIQNLKGLNSPVLIGKSKDPSCVAKVSEEGKPAIGTQKMELH 
VRWRSDTGKCVDASPLWI PTFDKSSTTVYIGSHSHRMKAVDFY 
SGKVKW2Q I LGDR IESS ACVS KCGNF I WG CYNGL VYVLKS NSG 
EKYWMFTTEDAVKSSATMDPTTGLIYIGSHDQHAYALDIYRKKC 
VWKS KCGGTVFS S PCLNLI PHHL YFATLGGLLLAVN P ATGNV I W 
KHS CGKPLFSS PQCCSQYI CIGCVDGNLLCFTHFGEQ VWQFS TS 
GP I FS SPOTS PS EQKI FFGSHDCPI Y CCNMKGHLQWKFETTS RV 
YATPFAFHNYNGSNEMLLAAASTDGKWILESQSGQLQSVYELP 
GEVFSSPWLESMLIIGCRDNYVYCLDLLGGNQK 


5619 


2160


1477 


DSPVLPTSGlWISTAQPAQPWSAVEAALRSIiGSPPGAdRGCPCP 
AQS LHS HQ LAAWDP LKPS LRS YPPHLLQHPQLRS ltassghlgr 
RSCPQPRPLEELLRAGSSTRPQPLTSSCCGMSCMYSFLGHCSVL 

LWGTKGRGSGSPS s pgcclhppaqhsqdlplvhvdvgwqpplgp 

TVGLRPGLI/3ERQRGALRAGDPQOQCP L PATVREDLGVPSP WAA 
ECSPPATP 


5620


930


182 


PLPPPTLAMFLTRSEYDRGVNTFSPEGRLFQVEYAlEAIKLGST • 

AIGIQTSEGVOiAVEKRITSPLMEPSSIEKIVEIDAHXGCAMSG 

LIAnAKTLIDKARVETI^NHWFTYNETMTVESVTQAVS^ALQFG 

EEDADPGAMSRPFGVALLFGGVDEKGPQLFHMDPSGTFVQCDAR 

AIGSASEGAQSSLQEVYHKSMTLKEA1KSSLIILKQVI4EEKLNA 

TNIELATVQPGQNFHMFTKEELEEVIKDI 


5621 


3 


819 


WE F VE Y TATDAN VKNES hS S VQQLG I KMT VRYGKFIiS LLKDGA 
ENDLTWVLKHCERFIjKQQQTS IKSSLLCLQGNYAGHDWFVSSLF 
MIMLGDKEKTFQFLHQFSRLLTSAFLWLPRLHISSYIjPNDTVES 
GIHPVYFCSTHY1EMLLICAELPLVFSAFHMSGFAPSQICLQWIT 
QCPWNYIJ)MIEICHYIATCVFIjGPDYQVYICIAVFKHLQQDIIiQ 
HTQTQDLQVFLKEEALHGFRVSDYFE YME ILEQNYRTVLLRDMR 
NIRIjQST 


5622 


1122 


456 


AASTKDAVSRKRSHSASEKSGTGTSISKRLNMNPQIRNPMKAMY 
PGT FYFQ F KNLWE ANDRNETWLC FTVEG I KRRSWS W KTGVFRN 
QVDSETHCHAERCFLS W FCDDILS PNTKYQVTWYTS WS PCPDCA 
3EVAEFLARHSNVNLTI FTARLY YFQYP CYQEGLRS I»S QEGVAV 
BIMDYEDFKyCWENFVYlTONEPFKPWKGLKTNFRLLKRRLRESC 
Q 


5623 


3 


554 


flpffirapkisrngqwlftpttpfpfankalpgWegivpacfw 
r kki ltpstgtme llq vt i lfllps i cssnstgvleaanns l w 

TTTKPSITTPNTESLQKNVVTPTTGTTPXGTITNELLKMSLMST 
ATFLTSKDEGLKATTTDVRKNDS I ISNVTVTS VTLPNAVSTLQS 
SKPKTETQSS I KTTEI PGS VLQPDASPSKTGTLTSX PVTI PENT 
SQSQVIGTEGGKNASTSATSRSYSSIILPWIALIVITLSVFVL 

VGLYRMCWKADPGTPENGNDQPQSDKESVKLLTVKriSHESGBH 
SAQGKTKN 


5624 


159 


B98 


PG VAAAAGAL PQ YHG PA PALVS C RR ELS LS AGS LQLEHKRRDFT 
SSG SRKLY FDTHALVCLLEDNGFATQQAEI I VSALVKI LEANMD 
I VYKDMVTKMQQE ITFQQVMSQIANVKXDMI ILEXSEFSALRAE 
NEK2KLE1^QLKQQVMDEVIKVRTDTKI*DFNI*EKSRVKELYSLN 
EKKLLELRTEIVALHAQQDRALTQTDRKIETEVAGLKTMLESHK 
LDN I KYLAGS I FTCLTVALG FYRLW I 


5625 


1 


1180 


TI PSS AAAQRAG P P AGALE ALS PGGARAHAERRG EMRAT PLAAP 
AGSLSRKKRLELDDNLDTERPVQKRARSGPQPRLPPCLLPLSPP 
TAPDRATAVATASRLGPYVLLEPEEGGRAYQALHCPTGTBYTCR 
VYPVQEALAVLEPYARLPPHKHVARPTEVLAGTQLLY7VFFTRTH 
GDMHS LVRSRHR I PEPEAAVL FRQMATALAHCHQHGLVLRDLKL 
CR FVFADRERKKLVLENLEDS CVLTGPDDSLWDKHAC PAYVGPE 
ILSSRASYSGKAADVWSLGVALFTMLAGHYPFQDSEPVLLFGKI 
RRGAYALPAGLSAPARCLVRCLLRREPAERLTATGILLHPWLRQ 
DPMPLAPTRSHLWEAAQWPDGLGLDEAREEEGDREWLYG 


5626 


3123 


2011 



P PRALGS VAMENQ VLTPHVYWAQRHRE L YLR VELSDVQNPAI SI 
TENVLHFKAQGHGAKGDNV YB FHLEFLDLVKPE PVYKLTQRQVN 
ITVQKKVSQWWBRLTKQEKRPLPLAPDFDRWLDESnAEMELRAK 
EEERLNKLRLES EGS PETLTNLRKG YL FM YNLVQFLGFS W I FVN 
LTVRFC I LGKES FYDTFHTVADMM YFCQMLAVVETINAAIGVTT 
SPVLPSLrQLLGRNFILFI I FGTMEEMQNKAWFFVFYLWSAI E 
I FR YS FYMLTCI DMDWKVLTWLRYTLW IPLYPLGCLAEAVSVIQ 
SIPIPNETGRFSFTLPYPVKIKVRFSFFLQIYLIMIFLGLYIWF ■. 
RHLYKQRRRRYGQKKKICIH 


5627


3123 


2011 


P PRALG S VAMEKQVTjT P HV YWAQRHREL YLRVELS DVQNPAI S I 
TENVLHF KAQGHGAKGDNVYEFHLEFLDLVKPEPVY KLTQRQVN 
ITVQKKVSQWWEIU»TKQEKRPLFLAPDFDRWLDESDAEMEIiRAK 
EEERLNKLRLESBGS PETLTNLRKG YLFMYNLVQPLGFSWIFVN 
LTVRFCIl^KESFYDTFHTVAOMMYFCQMLAVVETINAAIGVTT 
SPVLPSLIQLLGBNFILFIIFGTMEEMQNKAWFPVFYLWSAIE 
I FR YS F YMLTCIDMDWKVLTWLR YTLWI PLYPLGCLAEAVSVIQ 
S I P IFNETGRFSFTLP YPVKI KVRFSFPLQI YLIMI FLGLYINF 
RHLYKQRRRRYGQKKKKIH 


5628


75


1455 


VAGAMASKCLKAGFSSGSLKSPGGASGGSTRVSAMYSSSPCKLP 
ShS P VARS FSACSVGLGRSS YRATS CLP AL CL ? AGG FATS YSGG 
GGW FGEG I LTGNEKETMQS LNDRLAGYLEKVRQLEQENAS LESR 
I RE W CEQQVP YMCPDYQS YFRTI EELQKKTLCS fCAENARLWE I 
DNAKLAADDFRTKYBTEVSLRQLVESDINGLRRILDDLTLCKSD 
LEAQVESIJCEELLCLKKNHEEEVNSLRCQLGDRLNVEVDAAPPV 
DLNRVLEEMRCQYBTLVENNRRDAEDWLDTQSEELNQQWSSSE 
QLQSCQAEI IELRRTVNALEIELQAQHSMRDALESTLAETEARY 
SSQLAQKQCMITNVEAQLAEIRADLERQNQEYQVLLDVRARLEC 
EINTYRGLLESEDSKLPCNPCAPDYSPSKSCLPCLPAASOGPSA 
ARTNCSARPICVPCPGGRF 


5629 


2287 


938 


GRPRSSSEttfRNFLRERAGLSSAAVQTRIGNSAASRRSPAARPPV 
PAPPALPRGRPGTEGSTSLSAPAVLWAVAVWWVSAVAWAMA 
NYIHVPPGSPEVPKLNVTVQDQEEHRC!REGALSLLQHLRPHWDP 
QBVTLQLFTDG I TNKLIGCYVGNTMED WLVRI YGNKTE LLVDR 
DBEVKSFRVLQAHGCAPQLYCTFNNGLCYEFIQGEALDPKHVCN 
PAI FRL I ARQLAKI HAIHAHNGWI PKSNLWLKMGKYFSL I PTGF 
ADEDI^KKPbSDIPSSQII^^MTWMi^iiSNl^SPVVLajNDL 
LCKNIIYNEKQGDVQFIDYEYSGYNYLAYDIGNHPNEFAGVSDV 
D YS L YPDRBLQS QWLRAYI*EAYKEFXGFGTBVTE KEVB ILFIQV 
NQFALASHFFWGLWALI QAKYS T IEFDFLGYAI VRFNQ YFKMK P 
BVTALKVPB 


5630


1194 


278 


GFWAIAQTCAHHLPPGSPWLVPASPWRLPBMSSFGYRTLTVALP 
TLICCPGSDBKVFBVHVRPKKLAVEPKGSLEVNCSTTCNQPEVG 
GLETSLDKIIiDEQAQWKHYLVSNISHDTVLQCHFTCSGKQESM 
KSNVSVYQPPRQVILTLQPTLVAVGKSFTIECRVPTVEPLDSIjT 
IiFLFRGNETLHYETFGKAAPAPQEATATFNSTADREDGHRNFS C 
LAVLDLMSRGGNI FHKHS APKMLE I YE P VS DSQMVI IVTVVS VL 
LS LFVTS VLLCF I FGQHLRQQRMGTYG VRAAWRRL PQAFRP 


5631 


1053 


290 


SRVDDFVRPfiPSRAEPSRSGRRRPARRAATMSVF^kLFGAGGGK 
AGKGGPTPQEAIQRLRDTEEMLSKKQEFLEKKIEQELTAAKKHG 

tknkraat^lkrkkryekqiaqiixstlstiefqrealent^ntn 

TEVLKNMGYAAXAMKAAHDNMDIDKVDELMQDIADQQEIiAEEIS 

taiskpvgfgeefdedelmablebleqeeldxnlleisgpetvp 

LPNVPSIALPSKPAKKKEEEDDDMKEIiENWAGSM 


5632 


3 


952 


wi^wspprrlwwgslgaaqrpavpvsglarslhvetrrphrra 
svrvargrlgvwaqpqpllprpvgsrremqppgpppayaptmgd 
ftfvssadaedlsgsiaspdvklnlggdfikestattflrqrgy 
gwllevedddpednkplleeldidlkdiyykircvlmpmpslgf 
nrqwrdmpdfwgplawlffsmislygqfrwswiitiwifgs 
ltipllar\a>3gevaygqvlgvigysllpliviapviiiivvgsfe 
wstliklfgvfwaaysaasllvgeefktkkplliypifllyiy 
flslytgv 


5633 


771 


460 


QG CS FCTMS VGRP FYRSS E FNEQLLS SHUHQVPF FCCFTWCLCN 
CLFENS VS KLYMLCFNFFMS I FFYSLS ITKLNLI YLWGLS YQS L 
LLLLLSGHRPWGSSMV 


5634 


1446 


855 


PRATGRIRSRAAASRPRAGAGASGABPRSGRER3RIiSGRRAPAM 
ARNTLS SR FRR VD I D E FDENKFVDEQE EAAAAAAEPGPD PS E VD 
GLLRQGDMLRAFHAALRNS PVNTKNQAVKBRAQG WLKVLTNFK 
SSEIEQAVQSLDRNGVDLLMKYIYKGFEXPTENSSAVLLQWHEK 
ALAVGGLGS IIRVLTARKTV 


5635 


3 


943 


DRGPRSTATOTORARVSFWRFPI^PGVlWSNVQiSGEKRRFRTL " 
RSLFHP FPVTRSGAPRAVLVGSS WPAKM VAPAVKVARGWSGLAL 
GVRRAVLQLPGLTQVRWSRYSPEFKDPLIDKEYYRKPVEELTEE 
EKYVRELKKTQLIKAAPAGKTSSVFEDPVISKPTNMMMIGGNKV 
LARSLM I QTIjEAVKRKQ FEKYHAAS AE EQAT IERNPYTI FHQAL 
KNCEPM I G1»VP ILKGGRF YQVPVPLPDRRRR FLAMKWM ITECRD 

KKHQRTIiMPEKl^HKLLEAFHl^GPVIKRICHDLHKMAEANRAIA 
HYRWW 


5636 




1143 


lk^xcqhppaekklylyhrklrevbrKgiprlpkdvfmdthqg 

LTDVRAKVTGFSEGWDS VKGGFS S FSQATHSAAGAWS KPRE I 
ASLIRNKFGSADNIPNLKDSIjEEGQVDDAGKAIiGVISNFQSSPK 
YGSEEDCS SATSGSVGANSTTGGIAVGASS SKTNTLDMQSSGFD 

allheiqeiretqarleesfetlkehyqrdyslimqtlqeeryr 

CERLEEQLNDLTELHQNE ILNIiKQEIiASMEEKIAYQSY ERARDX 

qealeacqtr i s kmelqqqqqq wqleglenatarnllgkl in i 
lijvvmavllvfvstvancvvplmktrnrtfstlflvvfiaflwk 
hwdalfs yverffss pr 


5637 


946 


2532 


hhhhpqhhlhpgsaaavhpvqqhtssaaaaaaaaaaaaamlnpg 
qqqpyfps papgqapgpaaaa paqvqaaaaatvkahhhqhshh p 
qqqldiepdrpigygafgwwsvtdprdgkrvalkkmpnvfqnr. 
vsckrvfrelkmlcffkhnnvlsaxdilqpphidyfebi yvvte 
lmqsdlhkuvspqplssdhvkvflyqilrglkylhsagilhrd 
i kpgnllvnsncvlki cdfglarveeldesrhmtqewtq yyra 

PEILMGSRHYSNAIDIWSVGCIFAELLGRRILFQAQSPIQQIjDIj 
ITDIJiGTPSLEAMRTACEGAKAHinRGPHKQPSIiPVLYTLSSQA 
THEAWLLCRMLVFDPYKRISAKDALAHPYLDKGRLRYHTCMCK 
CCFSTSTGRVYTS D FB P VTNPKFDDTPE KNLSS VRQ VKE 1 1 HQ P 
IIiEQQKGNRVPLCXNPQSAAFKSFISSTVAQPSEMPPSPIiVWE 


5638 


125 


1155 


DRKMSEIJDQLRQEAEQLKNQIRDARKACADATliSQITNNIDPVG 
RIQMRTRRTLRGHLAKI YAMHWGTDSRLLVS ASQDGKLI IWDS Y 
TTNKVHAIPLR^SWVMTCAYAPSGNYVACGGLDNICSIYNLKTR 
EGNVRVSRSLAGHTGYLSCCRPLDDNQIVTS SGDTTCALWDIET 
GQQTTTFTGHTGD VMS LSLAPDTRLFVS GACDAS AKL WDVREGM 
CRQTFTGHE S D INAI CFF PNGNAFATGSDDATCRL FDLRADQEL 
KTYSHDNI I CGITSVSFSKSGRLLLAGYDDFNCNVWDALKADRA 
GVLAGHDNRVSCLGVTDDGMAVATGSWDSFLKIWN 
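
As a rough consistency check on a row such as SEQ ID NO: 5638 (beginning location 125, end location 1155), the nucleotide span can be converted to an approximate residue count. This is a minimal sketch under stated assumptions, not a method given in the specification: it assumes the two locations are 1-based and inclusive, that three nucleotides encode one residue, and that a beginning location larger than the end location simply means the segment runs in the opposite direction. The function name `approx_residue_count` is hypothetical.

```python
def approx_residue_count(begin, end):
    """Approximate number of residues spanned by two nucleotide locations.

    Assumes 1-based inclusive coordinates and three nucleotides per residue.
    The order of begin and end is ignored, so reversed rows are handled too.
    """
    span = abs(end - begin) + 1  # inclusive nucleotide count
    return span // 3             # whole codons only


print(approx_residue_count(125, 1155))  # 343
```

A large difference between this estimate and the number of letters actually present in the segment column suggests that characters were lost or merged when the table was scanned.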


5639 


125 


1155 


DRKMS E I^QI*RQEAEQLKNQlRIlARKACAl}ATI*SQ I TNNIDP VG 
R I QMRTRRTLRGHLAKI YAMHWGTDSRLLVS AS QDGKLI I WDSY 
TTNKVHAIPLRSSWVMTCAYAPSGNYVACGGLDNI CSIYNLKTR 
EGNVRVSRELAGHTGYLS CCRFLDDNQIVTS SGDTTCALWDI ET 
GQQTTTFTGHTGDVMSLSLAPDTRIiFVSGACDASAKLMDVREGM 
CRQTFTGHESDINAICFFPNGNAFATGSDDATCRIiFDLRADQEL 
MTYSHDNI I CGI TS VSFS KS GRLLLAG YDDFNCNVWDALKADRA 
GVLAGHDNRVSCLGVTDDGMAVATGSWDSFIjKIWN 


5640 



280 


1092 


QQGN KKTMLSHN TMMKQRi^O QATA I MKE VHGNDVDGMDLGKKVS 
IPRDIMLEELSHLSNRGARLFKWRQRRSDKYTFENFQYQSRAQI 
NHS IAMQNGKVDGSNLEGGSQQAPLTPPNTPDPRSPPNPDN IAP 
GYSGPLKEI PPEKFNTTAVPKYYQSPWEQAI SNDPELLBALYPK 
LFKPEGKAELPDYRSFNRVATPFGGFEKASRWVKFKVPDFELLL 
LTD PR FMS FVNPLSGRRS FNRTPKG W I S ENI P I VITTBPTDDTT 
VPESEDL 


5641 


27 


332 


CWHNCNGDVKIXSNQMDKLFAFHLFTFHGLLHFLDGSIOKLIQA , 
EIILSDNSSILVLENNFLFKVKSKQFIHLIAKKFYISITIVSAS 
NGES FVLSMIVTG 


5642 


199 


1247 


IT PCRMDFLVL FLF YLAS VLMGLVL I CVCSKTHSLKGLARGGAQ 
IFSCI IPECLQRAMHGLLPTYLFHTRNHTFIVLHLVLQGMVYTEY 
TWEVFGYCQELELSLHYLLLPYLLLGVNLPFFTIiTCGTNPGI IT 
KAKBLLFIiHVYEFDEVMFPKNVRCSTCDLRKPAltSKHCSVCWWC 
VHR FDHHCVWVNNC I GAWN I RYFL I YVLTLTASAATVAI VS TTF 
LVHLVVMSDLYQETYIDDLGHLHVMDTVFLIQYIiFLTFPKIVFM 
LGFWVLS PLIX3G YLLFVLYIAATNQTTNEJ^YRGDWAWCQRC PL 
VAWPPSAEPQVHRNIHSHGLRSNTLQEIFLPAFPCHERKKQE 


5643 


1 


847 


PSGGVRDVBTRGPGSRAARG PRVVMKRRGVGAGAIAKKKLAEAK 
YKBRGTVLAEDQIiAQMSKQLDM FKTNLEE FASKHKQEIRKNPE F 
RVQ FQDMCATIG VD PLASGKG FWSEMLGVGDFYYELG VQI IEVC 
LALKHRNGGLITLEELHQQVLKGRGKFAQDVSQDDLIRAIKKLK 
ALGTGFGI IPVGGTYLIQS VPAELNMDHTWLQLAEKNGYVTVS 
EIKASLKWETERARQVLEHLLKEGLAWLDLQAPGEAHYWLPAIjF 
TOLYSQE I TAEEAREALP 


5644 


83 


1138 


PRRMGS W VQL I TS VG VQQNHPG WTVAGQFQE KKR FTEE VI E Y FQ 
KKVSPVBiKILLTSDEAWKRFVRVAELPREEAnALYEALKNLTP 
YVAIEDKDMQQKBQQFREWFLKEFPQ I RWKIQES IERLRVIANE 
I EKVHRGCVIANWS GSTG I LS VTG VMLAP FTAGLSLS ITAAGV 
G LG IAS ATAG I AS S IVENTYTRSAELTASRLTATSTDQLEALRD 
I LHD IT PNVLS FALDFDEATKMI ANDVHTLRRS KATVGRPLIAW 
R YVP INWETLRTRGAPTR I VRKVARNLGKATSGVLVVLDVVNL 
VQDSLDLHKGEKSESAELLRQWAQELEEWLNELTHIHOSLKAG 


5645 


537 




VQSVRDLKRLSPTDPPGDSGNRDVTREDPVTGPLNSASSQVPTL - 
YLOiQNSLLGHSSVEDARATMELYQISQRIRARRGLPRLAVSD 


5646 


3745 


3328 


AEQYGTS PHLLPTMLLSSCLPPANVTTKAATPPPLVLSLTTADP 
AGKPAPCRVTLTLLRAS IPATKRAS FLSS FI KMFFEE LE YTLGF 
LSLLKFHVHVS V YS AI CHFQKEGTGNS RS FTCTPELFP RLQTHL 
RAEGGAQ 


5647 


268 


800 


GVIMATSELSCEVSEENCERREAFWAEWKDLTLSTRPEEGCSLH 
EEDTQRHETYHQQGQCQVLVQRS PWLMMRMGILGRGLQEYQLP Y 
QR VLPLPI KI'^AKMGATKEEREDTPIQLQELLALBTALGGQCVD 
RQEVAE1 TKQLP PWP VS KPG ALRRSLS RS MS QEAQRG 


5648 


7 


1518 


VLS 3LCGRHEALRE VQAE WP P P TCS PKI CSGLQQAGNTDWS LTM 
APQS LPS S RMAPLGMLLGLLMAACFTF CLSHQNLKE FALTN P E K 
SSTKETERKETKAEEELDAEVLEVFHPTTIEWQALQPGOAVPAGS 
HVrRLNLQTGEREAKLQyEDKFRKNLKEKRLDINTNTYTSQDLKS 
ALAKFKEGAEMESSKEDKARQAEVKRLFRPIEELKKDFDELNVV 
I ETDMQIM VRL INKFNSS S SS LEE KIAALFD LE YYVHQMDNAQD 
LLS FGGLQ WINGLNSTE PLVKEYAAFVLGAAFSSNP KVQVKAI 
BGGALQKLL VI LATEQPLTAKKKVL FALCS LLRHF P YAQ RQFLK 
LGGLQVLRTLVQEKGTEVLAVRWTLLYDLVTEKMFAEEEAELT 
QEMSPSKLQQYRQVHLLPGLWEQGWCEITVHLLALPEHDAREKV 
LQTLGVLLTTCRDRYRQD PQLGRTLASLQAEYQVLASLELQDGE 
DEGYFQELLGSVNSLLKELR 


5649 


1172 


3006 


KLQEQLDAINEEIRMIQEEKESTELRAEEIETRVTSGSMEADNL 
KQLRKRGSIPTSLTDLSLA3ASPPLSGRSTPKLTSRSAAQDLDR 
MGVMTLPSDLRKHRRKLLSPVSREENREDKATIKCETSPPSSPR 
TLRLEKLGHPALSQEEGKSAIiEDQGSNPSSSNSSQDSLHKGAKR 
KGI KS S IGRLFGKKEKGRL IQLSRDGATGH VLLTDSEFSMQE PM 
VPAKLGTQAEKDRRLKKKHQLLEDARRKGMPFAQWDGPTWSWL 
ELWVGMPAWYVAACRANVKSGAIMSALS DTE XQREIGISNALHR 
LKLRLA IQEM VS LTS PSAP PTSRTS SGNVWVTHEBMETLETS TK 
TDSEEGSWAQTLAYGDMNHEWIGNEWLPSU3LPQYRSYFMECLV 
DARMLDHLTKKDLRVHLKMVDSFHRTSIjQYGI MCLKRIiNYDRKE 
LEKRREES QHE I KD VLVWTNDQVVHWVQS IGLRDYAGNLH ES GV 
HGALLALDENFDHNTLALI LQ I PTQNTQARQVMERE FNNLLAIjG 
TDRKLDDGDDKVFRRAPSWRKRFRPREHHGRGGMLSASAETLPA 
GFRVSTLGTDQPPPAPPKKIMPEAHSHYLYGHMLSAFRD 


5650 
~ 5651 


1172 


3006 


MLQEQI.DAINEEIRMIQEEKESTELRAEBlETRVTSGSMEAIiNL 
KQLRKRGSIPTSLTDLSLASASPPLSGRSTPKLTSRSAAQDLDR 
MGVMTLPSDLRKHRRKLLSPVSREENREDKATIKCETSPPSSPR 
TLRLEKIX3HPAIiSQEEGKSALEDQGSNPSSSNSS0 n SLHKGAKR 
KG IKS S I GRLFGKKE KGRLIQLS RDGATGH VLLTDS E FSMQEPM f 
V? AKLGTQAEKDRRLKKXHQLLE DARRKGM P FAQWDG PTWS WL 
E LWVGMPAW YVAACRANVKSGAI MS AIjS DT EI QRE X G I SNALHR 
LKLRLAIQEMVSLTS PSAPPTSRTSSGNVWVTHEEMETLETSTK 
TDSEEGSWAQTLAYGDMNHEWIGNEWLPSLGLPQYRSYFMECXiV 
DARMLDHLTKKDLRVHLKMVDS FHRTSLQYGIMCLKRLN YDRKE 
LEKRREES QHE I KDVLVWTNDQWHWVQS IGLRDYAGNLHESG V 
HGALLALDENFDHNTLALILQIPTQ^QARQWIEREFNNLLAIjG 
TDRKLDIX3DDKVFRJIAPSWRKRFRPREHHGRGGMLSASAETLPA 
GFRVSTLGTLQPPPAPPKKIMPEAHSHYLYGHMLSAFRD 


_ 5652 


646 


1869 


ARQGQRQPWG* EARAKG PASES PR V* EGSGWEGPAS P * TPGSTL 
AWGEGAG I R * ASGLTAAGAAS AAAA/ PP PTRGG PAP AG CGRAP P 
WP A P LR VP THG RAPAP RS RAAP RAPALS HGT AAAALS PAS PAGP 
ADP*LPGHSSQSPPRG*RWGRSRSAPAPAHPEHPAPAGSASASQ 
QTPGWPGSCCLAQGWQAEPLGAPGAEDG\PVPPQRGFPLGTLGS 
PAGS WAGLAG YG * AG AP GTQ ATAPRAAGQ T PVAAAPN CR V+ G S A 
PALHRAPAAADPGS PLQAP PRAWAS PAAAG PGLSSSDYCGGLGA 
GWRAGISPBLLGAAGLSDNWARCPGPGPAB *GGQPGCRTIPAS A 
CMPSPPVEGSLGLSRKGHGDLPSQAR*GWHECRRARHLVPLPRIi 
LGPRGRTGRPSS PS 




735 


34* 


HHKKYQHIHQKSFSCPEPACGKSFNFKKHLKEHMKLHSDTRDYI 
CE FCARS FRTS SNLVIHRRI HTGE KPLQCE I CGFTCRQKASLNW 
HQRKHAEXVAALRFPCEFCGKRFEKPDSVAAHRSKSHPALLLA 


5653 


66 


1401 



RGiUiQSRGRLT]^LVLLLLDII^ARQH6QRVSHGWKGGFLTAPL " 
CFPQPCQ PGTRRGRRRSLKE ATEPQ LAMAEE FVTLKDVGMD FTL 
3DWEQLGLEQGDTFWDTALDNCQDLFLLDPPRPNLTSHPDGSED 
kEPLAGGS PEATS PDVTETKNSPLMEDFFEEGFSQBI /3RDVZQ 
3WLLE LQFRRSLYRGHLVR ♦ FARRSRKS S EV * Y CHQRGKS KGMQ 
ES * I KSRTQS CVHR FHGRRFHG \DNVSEKTLTPAKS KE YRGEF F" " 
SYSDHSOODSVOPYTR If DVYV^cprvvcpcr* ovist twtjt.i t mmnn r-> 

PTVHQECEQGPDRKASHSGYPKTHTGYKPYVCNEYGTPFSQSTY 

LWHQKTHAGEKPCKSQDSDHPPSHDTQSGEHQKTHTDSKSYNCN 

ECGKAPTRIFHLTRHQKIHTRKRYECSKCO^TFtTLRKHLIQHQK 
THAANV 


5654 


3 


598 


TLPLFPGRRFRGWRRCGAVAARKNSTGGNVSINQRRDSVRMSAL 
^KPFVYGGLA5ITABCGTFPIDLTKTRPQI0GQTNEAKPKEI I 
x KuniAimu v k j. iJitGlsta L»KAL» YS G * VGLHAFLCHCSXiFHMGIDFR 
PRLHRSQVKSLRCV*KEQIA**/MPSLLISTLISKYIYYAADVL 
EKLFYYIQVQTDNNKKI CLFKNI 


5655 


2 


867 


rppgiraprqlhpaaqrrpdasArprprptvllhdpfqlsfppp 

PLSYPSVFPAVARVLPQRSGDYRAAGMPQLSGGGGGGGGDPELC 
ATDEMIPFKDEGDPQ\REKIFAEIVNPEEBGDLADIKSSLVNES 
E 1 1 PASNGHEVARQ AQTS Q E P YHD KAREH P DDG KHPDGG L YNKG 
ps» y a s y se Y imm PnMNNDP YMSNGSLS PP I PRTSNKVPWQ PS H 
AVHPLTPLITYSDEHFSPGSHPSHIPSDVNSKQGMSRHPPAPDI 
PTPYPLSPGGGGQITPPLGWQGQP 


5656 


228 


1065 


PRRVPPljPE FASGPGAAFFHSGRLQRSLTKDSAGCFSQCRSRAM " 
LVI/RSGLTKALASRT1APQVCSS FATOPRQYDGTF YEFRTYYLK 
PSNMNAFMENLKKNIHLRTSYSELVGFWSVEFGGRTNKVFHIWK 
YDNFPHRAEVRKAIiANCKEWQEQS 1 1 PNLAR I D KQETE ITYL I P 
WS KLQKP P KEG VYELAVFQMKPGGPALWGDAPE RAINAHVNLGY 
TKVVGVFHTEYGEIiNRVHVLWWNESADSRAAVRHKSHEDPISWG 
GVRBSVNYLWSQQNM 


5657 


105 


1052 



GQRLQ 3 PRVQMP VQ P PS KDTE EMBAEGDS AABMNGEEEES EEER 
SGSQTESEEESSEMDDEDYERRRSECVSEMLDLEKQFSELKEKL 
FRERLSQLRLRLEE VGAERAPEYTEPI/3GLQRSLKI RIQVAG I Y 
KGFCLDVIRNKYECEIiQGAKQHLESEKLIiLTOTlKJGELQERIQR 
LEEDRQSXjDLSSEWVTODKIiHARGSSRSWDSLPPSKRKKAPLVSG 
PYIVYMLQEIDILEDWTAIKKARAAVSPQKRKSD\DLDPAVHSQ 
GDPQSSWHCTQDSRLPPADRRTHRPLRVCPARLLWCCWALPLHL 
ALVWTPPL 


5658 


23 4 6 




■1 KKKV YNPWPEPDPD \CIQEDP WNLPNS I KTLVDNIQRYVEDGK 
NQLLLALLKCTDTELQLRRDAI FCQALVAAVCTFSEQLLAAIiG Y 
RYNNNGEYEESSRDASRKWIiEQVAATGVLLHCQSLLS PATVKEE 
RTMLEDI WVTLSELDNVTFS FKQLDENYVANTNVFYHIEGSRQA 
LKVIFYLDS YHFS KLPSRLEGGASIjRIiHTALFTKVLENVEGLPS 
PGSO^^DI^DINAQSLEKVO^YYRiaRAFYLERSNLPTDAST 
iAV^iuyijiKPiNA_dJELCRliMKS 

SELCYRLGACQMVMCGTGMQRSTLSVSLEQAAILARSHGLLPKC 
IMQATD I MRKQG PRVE ILAKNLRVKDQMPQGAPRL YRLCQPKMN 
GDL 


5659 


2 


696 


WKRSGEVSPKGELGAWRGNSGRPKIIGRAAEAENBDRTLGRLbP 

GNKRSOPR ^ PT.RTiT.ZXDOT.TTIiT? A V»^T.7\ OTJTiririoor»/->TTr»/~iT^/~i\ 
a tw \f f xv«a tr uiuMjtir WJuAnfijwwtUlsvxWUr V PP PFSSGtlSGPCX 

EREGEGQRGRGRSRRGAHLELKPSPGLRAGAPTDRGRGGPAEVA 
AAGGRRMVQKESQATLEERESELSSNPAASAGASLEPPAAPAPG 
EDNPAGAGG \AAVAG AAGGARRFIiOGVVEG FYGRP WVMEQRKEL 
FRRLQKWELNTYIi 


5660 


229 


853 


PVTMWAFS E LPMPLL INL IVS LLG FVATVTLI PAFRGHF I AARL~~ 
CGQDLNKTSRQQIPESQSVISGAVFLIILFCFIPFPFLMCFVKE 
QRKAFPHHEFVALIGALLAI CCMI FLGFADDVLNIiRWRHKLLLP 
TAASLPIXMVYFxTTFGNTTIVVPKPFRPIIX»LHIJ3IiGR*SYHCC 
P YGT YFRE PFLVLK I LLQVFL FCI*CVFPDP FW 


5661 


2 


473 


LNLYPSPCGGIPKLPGLPREAAAAI^FLAEAPLPVTVRGSGL 
AGMAVTCD P KAFLS I C FVTLVFLQLPLAS I CQN* GTDS CAS RG K 
ADFDVTGPHAPILAMAGGHVELQCQLFPNTSAEDMELRWYRCQP 
SLAVHMHERGMDMDGEQKWQYRGRT 


5662 


2 


iai8 


LRKEGRCRRGSNRGWAAPAEGI^GRGMLGVRCLIiRSVRFCSSA 
PFPKHKPSAXLSVRDALGAQNASGERIKIQGWIRSVRSQKEVLF 
LHVNDGSSLESLQVVADSGIjDSRELTFGSSVEVQGQLIKSPSKR" 
QNVELXAEKIKVIGNCDAKDFPIKYKERHPLEVliRQYPHPRCRT 
NVLGS ILRIRSEATAAIHSFFKDSGFVHIHTP 1 1TSNDSEGAGB 
ua rauiujft v yg&Pir r w v fArli 1 V SGQLHliEVMSGAFTQVFT 

fgptfraen sqsrrhlae fymieae i s p vds lqdlmq v i eel fk 
attmMvlskcpedvelchkfiapgqkdrl*hmlknnfliisyte 

AVE ILKQASQNFTFTPEWGADIlRTEHEKYLVKHCGNI pvfvtny 

pltlkpfymrdnedgpqelegsva*hslglmillsivvigqp 


5663 


119 


698 


PADIGRSTAKTPGPPRSLEMDDPRYGMCPLKGAS^CPGAERSLL 
VQSYFEKGPLTFRDVAIEFSLEEWQCLDSAWLYRKVMLENYR 
NLV FLG IALTKP DLITCLE QGKE P WNI KRHEMVAKPPVI CSI LFP 
QDLWAEQDIKDS PQEAI L KKYGKYGHANFQLQKGCKS VD ECKVH 
KEHDNKLNQCLI PKKKK 


5664 


na 

XXO 


D72 


SIjSMKSNHKSGDGLSGTQKEAALRAIjVQRTGYS lvqbngqrkyg 
GPPPGWDAAPPERGCEIFIGKLPRDLFEDELIPLCEKIGKIYE3V1 
RMMMD FNGNNRG YAFVTFSN KVEAKNAI KQLNNYE IRNGRLLGV 
CASVDNCRLFVGGIPKTKK 




347 


702 


wqhli illhcertspamitselpvlqdstnettahsdagsele " 
etevkg:<rkrgrpgrppstnkkprkspgbksrieagirgagrgr 

ANGHPQQNGEGEPVTLFEWKLGKSAMQRC 


5666 


213 


540 


VSCLPTS CKMI TLNNQDQP VP FNS S HP DE YKI AAL VF YS C I FI It ' 
GLFVNITALWVFSCTTKKRTTVTIYMMiJVALVDLIFIMTLPFRM 
FYYAXDEWPFGEYFCQILGA 


5667 


1 


695 


HPLPSASLGLPSVSU3VSLCVRSALLEAWPMLPXRRRWIVGSP 
SGDAASSTPPSTRFPGVAIYLVEPRMGRSRRAFLTGLARSKGFR 
VLDACSSEATHVVMEETSAEEAVSWQERRMAAAPPGCTPPALIjD 
I S WLTES LGAGQ P VP VE CRHRLBVAG P S KG PLS P AWMP A YACQR 
PTPLTHHNTGLSEALEILAEAAGFBGS EGRLLT FCRAAS VLKAL 


5668 


651 


894 


CSFLFCIPDLP^Ll!£RKEEEAVLVGGEWSPSlJXSLDPQADP 
v L> v K i AX KuAUAU IvXlvIiSGCTKlv 


5669 


407 


1 


DS GAPEGLS PLMS TQEGLSMHAHPQAYTPFI YLHARKRRGE IGD 
ADSRFNDRYAHKSAQLYFLYFVCWI FQDVYYFTI KEKNHFFFPK 
ARGAPTKYSGSPIGSPTTTPPTRPPSFNIiHPAPHLLASMQLQKL 
NSQ 


5670 


3 


A / J 


^KUL.rpiAWXPl>iUjPJaLILCTVSVASYEtiAQPSSVSVSPGQTAK 
ITCSGDVLAKKYARWl^KPGOAP^VIYKiyrERPSGIPERFSG 
STSGTTVTLTISGAQVEDEADYFCYSATDNFLWVF 


5671 


280 


MA 
9*54 


Kh fPKKl PPHJbUrtb.i?AJ.TX»WU fi-iLiUijIxbUQKHEHL I CWTSNDGE 
FKLLKAKKVAIGjWGIiRKNKTNMNYD ! 


5672 


2 


557 


FVPATPDPGV^PPSRDPAMAKRSSLYIRIVEGKNLPAKDITGS 
SDPYCI VKVDNEPI IRTATVWKTLCPFWGEEYQVHLPPTFHAVA 
FYVMDEDALSRDDVTGKVCLTRDTIASHPKGKFSLPSHTGLPS P 
WPPSHSETSPLGSVWSPAQGKPFLLSPEAGATFCTPGLCSAACS 
QAWLLLPLP 


5673 


327 


696 


ITVADQ I SHWSAGR I KNRTRI PE CIHSS AATTLAGPHTMEGES V " 
KLSSQTLIQAGDDEXNGJ^TITVNPAHMGKAFKVMNELRSKQLLC 
DVMIVAEDVEIEAHRVVLAACS PYFCAMFTGDMS 


5674 


17 


9B4 


GGGSMEGESTSAVI^GFVLGAIiAFQriLNTDSDTEGFLLGEVKGE 
AKNS ITDSQMDDVBWYTID I QKYT PCYQL FS FYNS SGEVNEQA 
LKlCtLSNVKKlWVQWYKPRRHSDQTMTPRnPTT.T^yQax^FrHFSFQ 
DLVFLLLTPSIITES CSTHRLEHSLYKPQKGLFHRVPLVVANLG 
MSEQIX3YKTVSGSCMSTGFSRAVQTHS SKFFEEDGS LKEVHKIN 
EMYASLQE ELKS I CKKVEDS EQAVDKLVKDVNRLKREI EKRRGA 
QIQAAREKNIQKDPQENI FLCQALRTF FPNSEFLHS CVMSLKID 
MFLKVAVTTTTISM 


5675 


eo 


753 


EGSRRGPTRLARLSARAGRLHFPPGFSSRLIHFRGVSECRRPPG 
KSGVPVSAPGSDGKWWEERPGMFSLMASCCGWFKRWREPVRKVT 
LLMVG LDNAG KTATAXG I QGE YP EDVAPTVG FS KI NLRQGK FE V 
TI FOLGGGIRIRGI WKNYYABS YGVI FWDSSDEBRMEETKEAM 
HKCL 


5676 


2 


930 


PVSS PPPRP VQPARPGG FGLSGRR S LL CQVASTPAHVG VMRS P V 
RDlJUlNDGEESri^TPLLPGAPRAEAAPVCCSARYNLAILAFFG 
FPIVYALRVNIiSVALVDMVDSNTTLEDNRTS XACPEHSAPIKVH 
HNQTG KKYQ WDAETQG W I LGSFFYGYT ITQI PGGYVASKIGGKM 
LLGFG I LGTAVLTL FTP I AADI*G VG PL I VLRALEGLGEG VTFPA 
MHAMWSSWAPPLERSKLLSISYAGAQLGTVISLPLSGIICYyMN 
WT xv FYFFGT IGI FWFI*LW I WI*VS DTPQKHKRISHYEKEYILSS 
L 


5677 


1 


1028 


PPRDGFXELRRLSVPLCSGPCPLTSLSRQGERSG^HI»VAAARAA 
VTAETHPL PLLAPLAVCQS VKS PAACQVRPR PRAVALPAALGGP 
GRSIiPGLTAATMSS FSESALEKKLSBLSNSQQS VQTLS LWLIHH 
RKHAGP I VS VWHRELRKAKSNRKLTFLYLANDVTQNS KRKGPEF 
TREFESVLVDAFSHVAREADEGCKKPLERLLNIWQERSVYGGEF 
IQQLKIiSMEDSKSPPPKATEEKKSLKRTFQQIQEEEDDDYPGQY. 
SPQDPSAGPLLTEELI KALQDLENAASGDATVRQKIASIiPQEVQ 
DVSLLEKITDK3AAERLS KTVDEACLRNRG PGTS 


5678 


3 


593 


6 5 S P P S S T P S Ij PLP F YL LLGQLRLQLL WGTAHLSGAG EAAPCPG 
GSGRTAAPRTRADPAAQSl^MIMNKMKNFKRRFSLSVPRTETIEE 
S LAE FTEQFNQLHNRRNEMLQLGPLG RDP PQECS TFS PTDSGEE 
PGQLiS PGVQFQRRQNQRRFSME VKASGALPRQVAGCTHKGvKRR 
AAALQPDFDVSKRLSLPMDI 


5679 


2 


623 


LNSRVDDFVAVPGAIMDEDYYGSAAEWGDEADGGQQEDDSGEGE 
DDAEVQQECLHKFSTRDYI MEPS I FNTLKRYFQAGGS PENVIQL 
LSENYTAV7\QTVmiLAEWLIC^GVEPviaVQETVE^mLKSLLIKH 
FDPRKADSI FTEEGETPAWLEQMIAHTTWRDLFYKLAEAHPDCI* 
MLNFTVKVGRVLELRRKVFMNVYFWLLVCFL 


5680 


258 




RRLTS TSEKLQNRNSHTPLESLIHPQP SYKG FG I MFGKKKKKX E 
ISGPSNFEHRVHTGFDPQEQKFTGLPQQWHSLLADTANRPKPMV 
DPS C I T PIQLAPMKT IVRGNKPC 


5681 



45 


869 


LIX^TIiGv^TICESQABGYNRSGlNNHQAEDPRFCPSFCV/MRSA " 
RQTRPQRLRKEAARPPTPGSCPGGTGMDGKKCSVWMFLPI*VFTL 
tTSAGIiWI VYFIAVKDDKILPLNSAERKPGVKHAPYIS IAGDDP 
PAS CVFSQVMNMAAFLALVVAVLRFI QLKP KVLNPWLNI SGLVA 
LCLAS FGMTLLGNFQLTNDEEIHNVGTSLTFGFGTLTCWIQAMi 
TLKVNI KNEGRRVGIPRVILSASITLCVGPLLHPHGPKHPHVCS 
QGPVGPGHVL 




39 


622 


PSRS CLGTMRKWRHREVNLP E VTQQDAVCPAP IPS PGLSAQTGL"" 

PFYNGFYYSNSANDQNLGNGHGKDLLNGVKLVVBTPEETLFTYQ 
GASVILPCRYRYEPALVSPRRVRVKWWKLSENGAPEKDVLVAIG 
LRHRS FGD YQGRVHLRQD 


5683 


39 


T78 


i /uji inuiifindvjuJidtUiAnAT 1 TUXTIJKVAPKIJIADMQRA 
HYKTDWHR YNIJlRKv7ASMAP VTAEGFQERVRAQRAVAEEBS KGS 
ATY CTVC S KKFAS FNAYBNHL KS RRH VE hE KKAVQAVNRKVEMM 
NEKNLE KGLGVD S VD KEAMNAAI QQ AI KAQPSMS P KKAP P APAK 

SSDEEHDLC 


5684 


19* 


677 


TWCFRGYLC^PkVl'M^CALbE^ PPYIjTVgITDVsAKYRGAFCEAKI KT 
AKRLVKVKVTFRHDSSTVE VQDDHI KG PLKVGAI VEVKNLDGAY 
QEAVINKLTDASWYTWFDDGDEKTLRRSSLCLKGERHFAESET 
LDQLPLTNPEHFGTPVIGKKTNRGRRYE 


5685 


779 


1262 


IiLOXJPVVHCFLLFPPFRFSHHMIPGPPGPHTTGIPHPAtVtPQ 
VKQEHPHTDSDLMHVKPQHEQRKEQEPKRPHIKKPLNAFMLYMK 
EMRANWAECTLKESAAINQI liGRRWHALS REEQAKYYELARKE 
RQLHMQLYPGWSARDNYVS PSSI P V7\I*HS 


5686 


128 


1181 


CTWWQVNITLL^INDNHPTWKDAPYYINLVEMTPPDSDVTTVVA 
VT^PDLGEiraTLVYSIOPPNKFYSIiNSTTGKIRTTHAMLDREWPD 
PHEAELMRKIWSVTDCERPPLKATSSATVTVNLLDLNDNDPTF 
QNLPFVAEVlegipaGVS I YQWAt DtDEGLNGLVSYRMP VGMP 
RMDFLINSSSGVWTTTELDRBRIAEYQLRWASDAGTPTKSST 
STLTIHVLDVNDETPTFFPAVYNVflVSEDVPR\GSGMSG*AARN 
NDVGLNAELSYFITGGNVDGKPSVGYRDAVVPTVvrsT to> ri>nt » 

YMLI LEA I DNG P VG KRHTGTAT VFVTVLD VNDKRPI ILQS S YV 1 


5687 


17 




aappappdg/ppp/pppappt/'pgpaVapas^cqprlsagraaH 

QGDGG AAAVGHVL WPAVGPVRVNPGLQTPVPRPELLPG P \S S S 
LHSDSSYPPDAGLSDDEF'DPnjvCT DDnnoor *i«t/t» / t\t\t\ /numm 

SGCRMPSTSASE/AAGGQGACTHAKGSBTPPPASPQTSEPAPSP 
LPPHLTGaPGMYSSEAKLPNSFSCIiGLAGTGAGI * GTASAHGTG 
PPVLPHVCTPSLANPQP\AVGPBASSLPLGVSGIGMSA/SAPIS 
^PFVAIGSCWLRGIPPPGSGFLCPGRAPGPVPITTHGQEGQGP 


5688 


1 


420 


' IiTKWDLFGNCYRLLKTd^EHGAMPEQVGVYWYS/CLYDSRKLFF | 
*SHMI IRSLL* KVIDDSLGQLPLLRELLL* *LNVTDRC1 ILAYV 
LRVEKTFAITYLKNFTVKVDFSLLGE I PLI SMAAILKLW I MKID 
DGYIPAVF 


5689 


1504 


1 3 


HEI^GKHISMVSGNTCNWHPGGHSPGGGGQGBITSKDRGBlPAlH 

IWA/RK?IGTWTATKPTHRAG*GGAEEYQPPPQPCBGPRSTSRG 

L»tij'uttftV(jFGREIGKBGSLPFLGPKALGF*SASCQRAFEGGAH J 

GSTARKPAPATPGTRHPRTMETREVAQGWPAGPRSQFWDQHPHS 

PGEHRPSG\SPLPACPPRAWPKAGAVASATGTG\PQLPGSRGKO 

KLPRTREPPLLQAGWAVRKPPWSEAKEGLGQAGRPSGMDSSAS\ 

PQTPGGRGSLBWGLPLYLGPHHDVK*RSDRIjG*PP*GGQGGGGH 

gapstpgpggeaw*lpqqtsrpkpgpqay*ge\gspglqcpcsk 

EIj*RVPPGSLGPSTQCKYEPTDKHS\GGAI>AQLEVSTAGSRSTF 

gqelkgpldagrlwpgapsassshr*gg*eraragaghrgst*a 
sskieqgrprpgptsdaladveggabs /g phpwplpgtlpnr/ p 
gspppa*asagrkgtvstlgggll 


5690 


1424 


58 


PSPPAGVCMPAPliPLIJUiARRDPJlPCSPGAE»UVPWQTG<3PAIDH 
GAWRTS VSALRRGATG/ APCS PGAEAAPWQTGGPAIDG\DGELP 

vi^tivAt'KutAjAEWGpasGPVRRPGAGRGAHAGQGRQQDPEP 1 
DGLRHRQHGAASHARHRLQRLRPGHHQNRHVRRDPQAPPGGPAP" 
GHAAALPERTRGVAEPPAWAHAGSDAWRAGR*SQRT*ERARPRH 
PTFQGRAGS \GQPGYQ PPNPHPGPSS PPAAP\GPRGA*GNPQLE 

kakrpe i PDGGPPSPAGSSASASTFRCTS I 
SLSLLG P/PGAHNLDTAPQDR* HGP*GDKRGAPGVAGEDPRPP * 
GNFVR * LLLMP/GVA* RHGTS PFLGPSLGENGGQWDSGNDFGTP 
KG * SHPAFTKST * SMEAEKS YWNH PHR \ DRGRQGVR INCLRVGE 
S EMWGP YSAPRPGTVFLSS FLS PASEEH\ PEGSSSFNTPFPPAG 
PEGDPGLNSPGLLP I 


5691 


107 


550 


ISNDPSPGYNIEQMAKRGKKLVEIiPYTVKGMDVSFSGZLSPlB^H 
VAHRMLATGECTPEDLCFSLQVMQ*KTGTESWG*RFYIVEQN*S 
GDAPL I FS P YLSLTGNCG FAMLVE ITERAMAH \ CGS PGG PSLWG 
GVGVYVLLESVPLSYS J 


5692 


1193 


548 


TQAWTRAEKDRKGSVRAI^HLERGPPT*RGSHPL\QSVPCJQ"K| 
PSIFSSYPI/GLPQSGGEPGPVGEQQPVRRPEQPSCGPASHMPI, 
TSRS VPPGRGALPPDS LSTRKGLPR PSTAGHRVRESGHKVPVSQ 
RLNLPVMGATRSNLQP PRKVAVPGPTR* RDQDS KQDFSS KPLQS 
VPGXJVSTQQTLTPADSGPGTGGRDATRAGLPGVE'mGNGVD | 


5693 


1258 


1330 


ALTWPVPJCGTTWWAQPHGCSNLVSRARLDLSSRPSQNTEPQAP | 
*QAGPPSSLRPP\SRRR*APEWPKRATGSRCRGLSAPPWPWPAA 
RGE/PGSAPSHAP/PNSPRPSGTRHP/PGPSSRVLYSPSLPRNS 
PEAIVWRSSRFPLWFPLRCCFWVSGFKDPNPVLRFF | 


5694 


3 


1338 


ss keparslhrrgsghkssagkwgsvtj^tagalg*kqlhq*wtH 

3RCL\NN!LSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR 
5LAESGLSWFSESEBKAPKKLEYDSGSLKMEPGTSKWRRERPES 
CDDSSKGGELKKPISLGHPGSLKKGKTPPVAV7SPITHTAQSAL 
<VAGKPTCKATDKGKLAVK?JTGIiQR5 S SDAGRDRLSDAK KP PSG 
EARPSTSGSFGYKKPPPATGTATVMQTGGSATLSKIQKSSGXPV 






WO 01/53312 



PCT/US00/34263 



KPVNGRKTSJjDVSNSAEPGFLAPGARSN IQ YR^ijPRPAKSSSMS 
VTGGRGGPRP VSSS I DPSLLSTKQGGLTPSRIKEPTKVASGRTT 
PAPVNQTDREfG2KAKAJCAVALDSDNISLKS IGS PES TP KNQASH 

PTATKLAELPPTPLRATAKSFVTCPPSLANLDKVNSNS LDLPSSS 
DTTQCI 


5695 


3 


1336 


GS KE PARS IiHRRGSGH KSS AGKWG S VTLSTAGALG* KQLHQ * WT 

OR vT? A MNT>^ ^FFFMACCTCT MCT T% Otirvrn pnnttrtinTirr — ^ 
V ' N - V " J \ vult ajj JZiCtE Ai/\o o J LuN jLitrtil KlAS R it M S X T "IjRTDS E KR 

SLAESGLSWPSBSBEKAPKKLEYDSGSLKMEPGTSKWRRERPBS 
CDDS S KGGELKICP 1 S LGHPGS LKKGKTP PVAVTS P I THTAQSAL 
KVAGKPEGKATOKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSG 
IARPSTSGSFGYKKPPPATGTATVMQTGGSATLSKIQKSSGIPV 

kpvngrktsldvsksaepgplapgarsniqyrs lprpaks ssms 

VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT 

PAPVNQTDREKEKAKAKAVALDSDNI5LKS1GSPESTPKNQASH 

PTATKLABLPPrPLRATAKSFVKPPSLANLDKVNSNSliDLPSSS 
DTTQCI 


5696 


3 


1338 


GS KE PARS LHRS.GSGHKSSAGKWGS VTLSTAGALG* KQLHQ * WT 
QRCL\NNLSSE2FNASSSLNSLPSTPTASRRNSTIVLRTDSEKR 

slaesglswfseseekapkkleydsgslkmbpgtskwrrerpes 

CDDSS KGGELKKPISLGHPGSLKKGKTP PVAVTS PITHTAQSAL 
KVAGKPEGKATDKGKIiAVKNTGLQRSSSDAGRDRLSDAKKPPSG 
IARPSTSG S FG YKK P P PATGTATVMQTGGS ATLS KI QKS SGI PV 
KP VNGRKTS LDVSNSAE PG FLAPG ARSNIQ YRS L PRPAKS SSMS 
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT 
PAPVNQTDREKEKAKAKAVALDSDNISLKSIGSPESTPKNQASH 
PTATXLAELPPTPLRATAKS FVKP PSLAMLDKVNSNSLDLPS SS 
DTTQCI 


5697 


1147 


47 



PS BALS PPACP SAP APRRS 1 1 SRL FGTS P ATEAAP PPPEP VPAA 
QGPATVQSVBDFVPDDRLDRS FLBDTTPARDEKKVGAXAAQQDS 
DSDGEALGGNPMVAGFQDDVDLEDQPRGSPPLPAGPVPSQDITL 
SSEBEAEVAAPTKGPAPAPQQCSEPETKWSSIPASKPRRGTAPT 
RTAAPPWPGGVSVRTGPEKRSSTRPPAEMEPGKGEQASSSESDP 
a^tr±j^v^nu^jr vi^oPUFBSEGSDTQRRADDFPVRDDPSDVTDE 
DEGPAEPPPPPKLPLPAFRLKNDSDLFGLGLEEAGPKESSEEGK 
EGKTPSI03NKK1GKKKGXEEEEKAAKKKSKHKKSKI)KEE<3KEERR 
RRQQRPPRSRERTAA 


5698 


2 


666 


GAEAAEPQEDLPPLSQSSRFFQEQQKMNKSLGPVSFKDVAVD7T 

QBEWQQLDPEQKITYRDVMLENYSNLVS VGYH I IKPDVISKLEQ 

GEEPWI VEGEFLLQS YPDEVWQTDDLI ERIQEEENKP5RQTVPI 

ETLI*R/ERGNVPGNTFDVETNPVPSRKIAYTHSLCNSCER\GF 

NASSEYISSDGRYARMKADECSGCX3KSLLHIKLEKTHPGDGAVE 
FNQ 


5699 


2 


1448 


rvrqppglwvrrtvpamqcpaglsrvpgvag/dpslpsfrgprd"" 

EAAHRGTIQTARHTRKLYVQGPASGPPLPRVSTQVAI*DEKPLA 

RPS/GRTNAPPPn/^nifDaf2WBaDrtin»»»r*T>ir»»iLirB\. »frur«««w w . #* 
*« *-V wivj.iiLfir'c r\^yi\jr'/^r\>4i\trljP/\AAUKVAMR \PGHPGLLAS 

DSQRSSSXGSGWETPVPWS*AQPGWVSGLLLLGDPSGPGSL*RS 
TWLVGGARGPEGSGVRGSGVfPSGCSDIGWAIiAGWNHS 'HLDPNT 
WTQKWTGE / S PAPGEEG\ VAPAP RG PT APttf»un?T .n-w o nvovm 

VPILFOWSGAliRSRRTEPAGWVPPTRHE+DDG^TAAPASGGAP 

VSTPTWAGTP/LNASLGPTDPQGKPGCRPPCALPKPAGPERSA* 

GGSLGCR/ SMLPASSGPPPAPGPRRLAAGAHTSASARCPPaAAA 

GWQPRR PGFAGRAALPGPPHP PSS * RELGGLPGPGW*TLDPLPA 

HPAHPPGSAPPWGALGGWAAARASLPWSPSLCLSFPAVTPVAGL 
FPPGRG 


5700 


923 


597 


NGHKGVWE INI Y * RRSN I HKNS KSE S HLNQDHS FP P PTPNS ARS" 
KLHSTGTAKNTGLPLSGAPRQRAVFSGRTI CQEFSSCLQCAYLD 
E*CSIASSLIKAILRVSVLSE 


5701 


59 



410 



IFEKICSDTQEFISPEINPQICSWttFDKGAK/NHATGKDSLFN 
KWSWKNWLSTCR*MRPGPYFTPYTKINSK*IK/DANIRCETVKL 
jEENTGENLHDTGLGNVFLDMTPKTQPTKQK 


5702 


3 


1517 


ETFVDPSQCGGIPSDSPHPVITPSRASESSASSDGPHPVITPSR 
ASESSASSDGPHPVITPSRASESSASSDGLHPVITPSRASESSA 
SSDGPHPVITPSRASESSASSDGPHPVITPSRASESSASSDGLH 
PVI TPSRAS ES S AS SDOPHP VI TPSWSPGSDVTLLAEALVTVTN 
IEVINCSITEIETTTSS IPGASDTDLIPTEGVKASSTSDPPALP 
DSTEAKPHITEVTASAETLSTAGTTESAAPHATVGTPLPTNSAT 
EREVTAPGATTLSGALVTVSRNPLEETSALSVETPSyVKVSGAA 
PVS I BAGS AVG ICTTS FAGSSASS YSPSEAALKNFTPS ETLXMDI 
TTKGPFPTSRDPLPSVPPTTTNSSRGTNSTLAKITTSAKTTMKP 
PTATPTTARTRPTT\A*VQVKMEVSSSCG*VWLPRKTSLTPEWQ 
KG+CSSSTGNSTPTRLTSRSPYCVSGEANG/PSAAARHVPYAKR 
GCCP* PGPPPTDCSCVTVLRGTQKVPMKGSMSKPLTPDVATCPS 
LTSTG VYVWGGAS P V PPRVT •fiT.TT.aui/T.r' va w irr 


5703 


14 


1117 


HHKDSRSQGIiPRTQBGARPELRPIiLCPRALWPVTRLSYRCPWQA " 
PKAGIGTKAKPSESHLKLHPGWPSLDRQGEPATLGTGTGHCSDS 
RILRWHP*HTAAR*PRWRRLPSSHRWTRHLGVIiRVQDKS**VSL 
DPSCRPRFLRTC* * YGMRSVASS SNP PPGWSGPGASVFPARPVS 

AIiPTGPRCW^APPfSDTDnDrY^UDDT.CODlJJlTXtM.'tr'TV^/tTiT n rt r-»i-k 
***** Avairi\^.n t\rt\\Jti 1 Ir l_ljn l^KlaoolrriA i AUW(jr*GCPljo PSR 

GSWETAPGS * WCPWL* AARWTGWRTASGAS AGLGRAADRPSAWA 
RRVAGLLPGQGLTVRR*H*TAGAPASVRSSQGATRSPAPGGDQC 
ACGRGPGSC+HPPPWPVSPSSPVPCPSGR*HLRGPLLSAARPRA 
AGWPRHSPHDTQTPBP 


5704 


23 


562 


wvauc as«?c i nuyio^AAMJuv icvuilH VaUIA/iCi 1 Aft Vt A T SHEW I 

SGNAASDKNIKDGVCAQIEKNFARAKWKKAVRVTTLMKRLRAPF. 
QS STAAAQSASATDTATPGAAGGATAAAASGATSAPEGDAARAA 
KS DNV APRR P * LPPQPQME VPPQPLMAVS PQPP MEASLQ PLMGE 
SPQP 


5705 


23 


562 


SGNAAS DKN I KDGVCAQ I EKNFARAKWKKAVRVTTLM KRLRAPE 
QSS TAAAQS ASATDTAT PG AAGGATAAAASGATS APEGDAARAA 

KSDNVAP1UIP*LPPQPQMEVPPQPLMAV5PQPPMEASLQPLMGE 
SPQP 


5706 


1161 


610 



QLGRFXAQDT VAT P KilV e v vciTr a mp vnn> t t prpuyun^pfinAr t> — 
D YVANTDN CS LKDL VRE CERR YCAFNNWGS VE E QRQQQ AELLA V 
IERLGREREGSFHSNDLFLDAQLLQRTGAGACQEDYRQyQAKVE 
WQVEKHKQELRENESNVfAYKALLRVKHLMLLHYE I FVFLLLCS I 
IiFFIIFLF 


5707 


28 


609 


GSPAPTPGFRRRPGRGTPSPGXRHKQGRAEPEPDAPERAPLRR* 
MFA I Q PGLAEG GQ FLGD P P PG LCQ PE LQ P D SKS NFMASAKDANE 
NWH GM PGRVEP I LRRS S S ES P SDN QAFQAPG S PE EGVRS PPEGA 
EIPGAEPEKMGGAGTVCSPLEDNGYASSSLS IDSRS SSPBPACG 
TPRGPGPPDPLLPSVAQA 


5708 


44 


1925 


SFSWEETISPCFPKMPAEPWWLSPVSLGAAGWPGQPRPYIiDIjPA - 
QASVSRPHDRA*GEAVSLSLSSGDVCGHTDGGGAGSDPQAKPKP 
PRCPFTAMPSPRTKQKVRNKVCLLIAIRYSDIPSDVSKAP \gpa 
GNPHDRSSTAA+LHRRAGAGSLCLSASLLPPSFSLGAPGAPSPL 
RVSPASGGPRKEGRQGSGG *AGGGGP \arthadlpcvgfvcs pp 
I*LiC*SDS PVK0LPA\ SGOGSGASMPPVGSSDTT.P pp prcvcnTr* 

RAAG*CSWQPAACCTPRSQ*WAVARSPSRCSRW*RQSGR*RG*S 

srrrrgp*aagrstpavp*pcs*ggagrrayacrtgwgyapsr* 
lepsgptsgsal*twashstga* *srlcgtagtgplcsqssrs * 
ag*rccctaaspcx3gsgpshpgspsahciiswsggrtqprapsah 

GRGRAMGSRCVCTCTGL PCPG I PLS GAS PGGSG ETGAGRSHTLK 
AARSRLSPRPGSGSRGSY*SHNDNWGTWPAPPSAGHLLVGG+NS 
QRTSSDH*YTGTRRPWAGPGTRCSTAPSRAAPPVSRCRPPPPPP 
PPRPPRLPAAAS/SGGASGSPAASCSCSCRAPAKPASS/GEAPA 
PPPRPEPPPPPARRP 


5709 


2 


2031 


ITLCPLPQTEKCLNVVTEAATPLGIYLKARVEAGGLKELEISWG 
IiHQIWRWGAWMRAGMGGCROffGVMAPFAPR/NALS FLVNDCS 
LIHNNVCMAAVFVDRACTWKLGGLDYMYSAQGWGGGPPRKGIPE 
IjEQYDPPEIiADSSGRWREKRSADMt^UjGCLIWBVFNGPLPRAA 
ALRNPGKIPKTLVPHYCELVGANPKVRPNPARFLQNCRAPGGFM 
SNRFVETNLFLEBIOIKEPABKQKPFQELSKSLDAFPBDFCRHK 
vuryuuAAc as vnnunv v ux true ltv\9JvjrJUs/U5JSX\|^lvX J fv V VK 
^SSTDRAMRIRLLQQMEQFIQYLDEPTVNTQIPPHVVHGFIiDT 
NPAIREQTVKSMLIiLAPKLNEANLWELMKhT^ I 
RCNTTVCLGK1GSYLSASTRHRVLT3AFSRATEIDPFAPSRVAGV 
LGFAATHNL YSMNDCAQK I LPVLCGLTVDPB KS VRDQAF KAI RS . 

VSS LTS KL I RSHPTTAPTETNI PQRPTPEGVPAPAPTPVP ATPT 
TSGHW ETQSED KDTAEDS STADRWDDEDWGS LEQB AJSS VLAQQD 
DWSTGGQVS RASQVS \TPTTNPPNP QS PTGAAGK\ RGLLGTG LA 
GAKLPGATS * R YTAGOP V 


5710 


1 


562 


I PG S T I S CE VELMARMAKT I DSFTQNQTRL WI I DGLDACEQDK 
VLQMLDTVRVLFSKGPFIAI FASDPHI I IKAINQNLNSVPSGFK 
\LNGHD YMRNI VHLPVFLNSRGL/RQ/IjQENPS * LQQQMBTFHA 
Q I LQG YRKKLTEEFHRTAI*GR*QNLVARQPS IDG*DAIG FELYV 
CIAIQFNTNKDDAT 


5711 


1526 


1130 


RRHPFQWTTVTQEAFSHHDVAFTSTPVLFYPDSAQPFIVKSESS 
SQIAKAVI<SQQRPSLFHECAFHFFS*SLQRHTINLDOGIP*LIjM 
LSEBRQHLFESS / I WTTPHNLK* / FEIHEHLGSHEGHWTLFFLL 
QIL 


5712 


3 


1391 


GRKLFQSI^ISERI*KFliLTLDCVDDTLI VLAEBHGCLDI I KELP 
BtlvX IJI »I iNKv^TFHPSKRPTPDEIMKDKVFSE VSPLYTPPTKPA 
SLFSSSLRCADLTLPEDISQLCKDINNDYLAERS IEEVYYLWCL 
AGGDLEKELVNKEI IRSKPPICTLPNFLFEDGES FGQGRDRSS / 

NELSAAATIiPLIIREKDTEYQLNRI ILFDRLLKAYPYKKNQIWK 
EARVD I PP LMRGLTWAALLGVEGAIHAK YDAIDKDT P I PTDRQ I 
EVDI PRCHQYDIOLSSPEGHAKFRRVLKAWVVSHPDLVYWQGLD 

LTVFS QM I AFH D P E LSNHLNE IGF I PDL YAI PW FLTM FTHVFPL 
HKI FHLW\ DTLLLGEFLFPILYWB 


5713 


634 


284 


P VCAVP VDRW P VL PREDQEGQQL* AKLPRDFRR * FQ I LG PMEGH 
TACRCSRRGAQVQHLPRBD1RAAE*DPHLREVWPGLPTSSATS p 
* RAVLTS PCSHLGS ADAA9SHWLCGVS FH 


5714 


212 


613 


WGLGliGPTMSSLGGGSQIlAGGSSSSSTNGSGGSGSSGPKAGAAb ' 
KS AVVAAAAPAS VADDTP PPERRNKS G I ISEPLNKSLRRSRPLS 
H YSS FG S S GG SGGG S MMGGES AD KAT AAAAAAS LLANGHDLAAA 
MA 


5715 


131 


1979 


ESASO^KRSKaJILTLKLELSGSAPKKTSARPGSSLWLPPHSQE 

qtppasklcggggglqlx3w01jipvpvtaasplprwclfgavak\ 
glpgp*lcpsgaa/gglorgpglspiigaagkvsclhppsmvenn 
dstchehhegilaarvtpvp\sgkpgrvlkppgrvcrpphpaas 
prppgs / sdldgprpqmhlrafpaahggpvntphggeektfmss 

OIRRKETKPL*RKTPAG\NNVrtQMQTPVQnCDnT tutjt r dc&pd 
TQAPSGRG DAGKPTPGHG \LP KASVT LTPNCPCS LAGGQ * PPGL 
YPKTPKQRRWRRPL/LLGPSQ*GSRQSTC+EV\GALGBPVRIPG 
L* PDLS CILSNGSKHRREGLSFPRSIjGPGRRfiPAGLnST .nPQ dt 
PKNTACHSSGHVALQAGHDSARDVGSGHVALQAGHDSTQDVGRP 
VWRWI PLE * LGLSRETGQATRRGL VW IS PGRAAAACVACAQALE 
EGPLRLPGQDRGAQPCSHCPGRAAGQPBPGAGAPCRE/GG * DPT 
GLT/GVPGTDPKRGGRKPGQSGQETQGPT VWSGPESPLQPKP * E 
RQE/VGAGASSGVGLSRGRAGGPSSAWBVAAMLLLLRHGSHSEL 
TDLTEAQTSQH 


5716 


1711 


1370 


RVFS LLCEGPGHCYQGAVCRBACAAASPGLDSAAEPHRLCEHTD 
*LPK*GPGYIQHFHCDSNILCILYNISFNLFSYSF*GVARYAC* 
RCPLVL* SGFFTI IVGGYS CCMPLKT 


5717 


44 


1489 


LPTEALRESEWVSEYGKCX3PRGLVPEGESTSPLPSSVDTEDSLD 
EGPGALVLESDLLLGQDLEFEEEEEEEEGDGNSDQLMGFERDSE 


5718 


125 




GDSUGARPGLPYGLSDDESGGGRALSABSEVBEPARGPGSARGE 
RPGPACQLOSGPTGEGPCOGAGGPGGGPLIiPPRLIjYSCRLCTFV 
SHYSSHLKRHMQTHSGEKPFRCGRCPYASAQLVNLTRHTRTHTG 
EKPYRCPHCPFACSSLGNLRRHQRTHAGPPTPPCPTCGFRCCTP 
RPARPPSPTEQEGAVPRRPJSDALLIiPDLSLHVPPGGAS FLPDCG 
Q\CGVKGRASAGLDQNHCQS/SLFPWTCRGCGQELEEGEGSRLG 
AAMCGRCMRGEAGGGASGGPQGPSDKGFACSLCPFATHYPNHLA 
RHMlCl^GEKPFRCARCPYASJuILDNIiKRHQRVHTGBKPYKCPL 
CP YACGNLANLKRKGR1HSGDKPFRCSLCNYS CNQSMNLIRHM 






i 284 


VAHALSLPAES YGNDVSMTHPQL PPTQLAWDLCRTCL PLS YNFT 
S**STADPLHL 


5719 


46 


1 428 


EfcNNGPFQMPU^GGI^VTGSWADRSPIJ^ 
LSQGYNWAVTLDHVTPLHEACLGDHVACARTLLEAGANVNAIT 
I DG VTPLFNACS QGS PS CAELLLEYGAXAQP \ p«? d, p <5 p 


5720 
5721 


1 


! 1051 


LQAFRNASKVPMVLVGTQDAISAANNPRVYRRTSRARKLSTDLK 

\rct\yye\tcggtyglqmwsvsfqdvaqkwal\rkkqq\lai 

GPCXVSLPK\SPSH\SAVSAAS T PAD&PTVOruc /cr>nr>e* t»e^ 

Y\SSSVPSTPSISQRELRIETIAA5STPTPIRKQSKRRSNIFTS 
RKGADP\DRE KKAAGCKVDS IGSGRAIPI KQGI llkrsgkslnk 

ewkkkyvtlcdnglltyhpsrihdymq?7jhgkbi dllrttvkvpg 
krlpratpatapgtspranglsversntqlgggtgaphsassas 
lhserplsssawagprpeglhqrscsvssadqwsea'ttslppgm 

QHPASG 




97 


492 


RHSSPCCSLKRTBRSSNAAVST/TTVQQFKRFIENYRRHIGCVA 
VFYAI AGGLFLERAYYYAFAAHHTGI TDTTRVG TTT, ejprrra act 

SFMFSYXLLTMCRNLITFIiRETFLNRYVPFDAAVDFHRLIASTA 


5722 


88 


1043 


VAL D VLAGS 5 PGGGMAGALLG PR VHG I RAVXR VARGGVQAPG AP 
GSLGVSHAAAPPARPQGJVAQS PHRGRRHGGGGAGLPPPRS PR FP 
QESVPASTSTARGPRRVSRRLPPQHPGPRGRRRRPGAGVGAPRR 
GRARGQAGLLGRQGQGGRGAE RERAALQ ARRGRR PGPEPDQS CG 
GRPRRAAAAPGRAPADPQPPAPRPAPAPDVRPPADAPAPAPAPA 
PP PP PHLGALTAGSGEERQ S Q PRAETLRJU3RGAPLP \ PRAERGG 
RPKQAEQQONPKRPTPPARGPn^^finpaMT.ooDA^T DTrrr 
KSSTREIPEMI 


5723 


88 


1043 


VAL DVLAG Sis PGGGMAGALLG P R VHC> I RAVLR VARGG VQAPGA P 
GSLGVSHAAAPPARPQGAAQSPHRGRRHGGGGAGLPPPRSPRFP 
QESVPASTSTARGPRRVSRRLP PQHPGPRGRRR RPGA3VGAPRR 
GPJUIGQAGIMRQGO^RGAERERAALQARRGRRPGPEPDQSCG 
GRPRRAAAAPGRAPADPQPPAPRPAPAPDVRPPADAPAPAPAPA 
PPPPPHLGALTAGSGEERQSQPRAETLRLGRGAPIiP\ PRAERGG 
RPKQAEQQQ\PKRPTPPARGPQSSGDPAMLPQRAGLRTGGLAGT 
KSSTREIPEMI 


5724 


3 


1841 


FTNEAPPAPiiPIJ^ASPI^PHRPJUCSlaJRRSTEPSVTPDLLNFK ~ 

KGWLTKQ YEDGQWKKHWFALADQSLRYYIIDSVABEAADLDGE ID 

LS AC YDVTE YP VQRNYGFQ IHTKEGEFTIiS AWTSG IRRN WIQT I 

MKHVHPTTAPpVTSSLPBEKNKSSCSFETCPRPTEKQEAELGEP 

DPEQKRSRARE \ RRREGRSKTFDWAEFR P IQQALAQERVGGVGP 

Aimi\DPWRPEAEHGELEKERARRREERRKRFGMLDATDGPGTE 

DAAUWEVDRSPGLPMSDr.KTHNVHVEIEQRWHQVETTPLREEK 

QVPIAPVHLSSEDGGDRLSTHELTSLLBKELEQSQKEASDLLBQ 

NRLU3DQLRVALGREQSAREGYVLQATCERGFAAMEETHQKK3 E 

DLQRQHQRELEKLREEKDRLLAEETAATISAIEAMKNAHREEME 

RELEKS QRS QI S3 VNSDVEALRRQ YLEEIiQS VQRELEVLS EQYS 

QKC^ENAHI^ALEAERQAI^QCQRENQE 

ITRLRTLLTGDGGGEATGSPLAQGKnAYELEVPSGARPCLTQLC 

TQEPQGSAAWPLSYRWGGTDLRQQESQQPGRSKSPEGGEEQ 


5725 


3 


104 9 


VWGHSEETSQSPNRTEPHDSDCSVDLGISKST3DLSPQKSGPVG 
SWKSHSITNMEIGGLKIYDILSDN\DLSSHLQPLK/FTSAVDG 
KNIVRSKAATLLYDQPLQVFTGSSSSSDLISGTKAIFKFDSNHN 
P E / G AKYNKRP HKWAHNLHLKYMVLHS 1 1 SNTVAV \ RSQRHFVA 
LQTKS PNRPCQFSSSAPS / VDQRAQ / INQSYAKHSANMNFSNHN 
NVRANTAYHLHQRLGPARHGEMWAI S PNDRLI PAVTRSTIQRQS 
S VS STAS VNLQD PGSTRRAQI P EGD YLS YREFHS AGRTPPMM PG 
SQRPLSARTYS IDGPNAS RPQSARPS INBIPERTMSVSDFNYSR 
TSP 


5726 


2 


486 


" SRSLSMWWNSGLPA^SHSSKLPVTVGFSGCVKRliRLHGRPLGAP 
TRMAGVTPCILGPLEAGLFFPGSGGVITL/ESVGAGIPGPSRAG 

! QGS PGGSGEG PPLSS PSQPLPADLPG ATL PDVGLELEVRPLAVT 
GLI PHLGQARTPPyLQLQVTEKQVLLRADDG 


5727 


21 


221 


RPILILKETRRLPWATGYAEVINAGKSTHNEDQASCEVLTVKKK 
AGAVTSTPNRNSS KRRSSLPNGE 


5728 


2 


877 


gtrngqfeprrgrAWeg'sagglrXpgaaaggpgVqprgsg/lpg" ' 
nairagvnpgrgpaspfwdlslpwdlwppptdhapgapdfpave 

GR\ PWAGGRPPWPVSGVLGSRVCGPLYSTSPAGPG/SGGLS psq 
GGPAGAGGDAG / LPGRCP S APWRAGSR P AAS CPDWI PGPQ9LWL 
HRNPTS/GPPSQIGEGAEQGDEGVADAPQIQCKN/GAEDPPAED 
EPPQVPEAGEEDAVPAEEGPGGTPETQADQVRERPEAHLAEGGA 
KGS PRRLADPQDL PAGQMSLAPP FP PVAAVI RSNX 


5729 


1 


1525 


AGGARE VLTLU liGHFAGFVGAHWWNQQDAALGRATDSKE" P PGEX# 
CPDVLYRTGRTLHGQETYTPRLILMDLKGSLSSLKEEGGLYRDK 
QLDAA IAWQG KLTTHKBEI* Y P KNP YLODFLS AEG VLSSDavwp v 
KS IPNGKGSSPLPTATTPKPLIPTEAS IRVWSDFLRVHLHPRSI 
CMIQKYNHDGEAGRL£AFGQGESVLKEPKYQBBLEDRLHFYVEE 
CDYLGGFQILCDLHDGFSGVGAKAAEHjQDEYSGRGIITWGLLP 
GPYHRGEAQRNIYRLJJ^AFGLVHLTAHSSLVCPLSIJGGSLGLR 
PEPPVS FPYItH YDATLP FHCS AILATALDTVTCS \ YRLCSS PVS 
MVHL\ADMLS FCGKKWTAGAI I PFPLAPGQSLPDSLMQFGGAT 
PWTPLSAC3GEPSGTRCFAQS WLRG IDRACHTSQLTPGTPP PSA 
LHACTTGEEII^YLG^O^PGVMSSSHLLLTPCRVAPPYPHLFS 
SCS PPGMVLDGSPKGAAVESVPVFG 


5730 


1258 


1713 


KKFQAPARETCVECQKTVYPMERLIiANQQVFHl SCFRCS YCNtfK 
LSLGTYAS LHGRI YCKPHFNQLFKS KGN YDEGFGHRPHKDLWAT 
KIETEGFWERPRI7FENCGRPLKSPGGEDCPSC*GGCPGSNY*AQ * 
GSSSREKGGQASWNPKLRVA 


5731 


122 


443 


RSHRGELIPKDSCTMRKPPRRPtCKRRQG/CALPQGCLTFKDVAi" 
E FS LE E W KCIiNPAQRALYRAVMLEN YRNLEBVGLTS KDSWYMRK 
KPGRGRGKQRRQEWFFLRVY 


5732 


226 


772 


PPSRSCQSPRRKSRRRAHVT.VTLVCGFTSFSFSIiPLYLCGCLRF 
PERTCSQLQQADWAPDFGPSS FVPSWGATATGARXFLIAFNI \N 
I^TKEOAHRIAI^REQGRGKIX3PGRIiXKVQGIG^LDEKNl4A 
QVSTNLLDFE VTALHTVYE ETCREAQELS LPWG SQLVGLVPLX 
ALLDAA 


5733 


1 


460 


PALQEVNANALAWGKQYBNDARTLFBFTSGVNDTESPIIYRDES 
MRTACS PDGLCSDGNGLELKCPFTSRDFMKFRLGGFEAIKSAYM 
AQVQYSMWVTRKNAWFANYDPRMKREGLHYVVIERDEKYM\AS 
FOE I \ VP \ EFIGKMDE VLSRDPM 


5734 


3 


968 


RCNSPESLTSLLVLLTTANNLFVLIPAYSKNRAYAiFFIVFTVI 
GSLFLMWLLTAIIYSQFRGYLMKSLQTSLFRRRLGTRAAFBVLS 
SMVGEGGAFPG^VGVKPQNLWVLQKVQLDSSHKQAMMEKVRSY 
GSVLLSAEEFQKLFNELDRSVVKEHPPRPEYQSPFLQSAQFLFG 
HYYFDYliGNLIALANLVS I CVFLVLDADVLPAERDDFIIiGILNC 
VFIVYYIiLEMLLKVFALGLRGYLSYPS^FIX3LLTVVLLVLE I S 
TL \ VCTDCHTQAGG RRWW/RLLS L WDMTRM LNML I VFR FLR IIP 
SMKPMAWASTVLGL 


5735 


2 


540 


FFTPCVARAFNFP DQATVKKAAYSLPRVGGGTS CGLPQARRI SL ' 
ATPRQLYK/SSNMTQRWQRREISNFEYLMFLNT1AGRTYNDLNQ 
YPVFPW VLTNYESEELDLTLPaNFRDLS KPIGALNPKRAVFYAE 
RYETWEDDQSPPYHYNTHYSTATSTLSWLVRIVSIFIEIiACLWY 
LKILT 


5736 


1 


382 


3TRPSTKKSGYSPQQVAVIHCKGHQKENTAVAHSNQKADSAAQV " 
TARLSVTPPNLLPTVS FPQ PDLPbN PV YSTTTEKIAS DLRANKN 
QES * * I L PDSGI FI P * T*TS YLQSTTHLRRAKLPQLLRR 


5737 




1041 


KACLHLLSSPLTSNFLFNPLLPDSLYSVEARSQRANLGPCRHiCR 
LQTLMRLAAG PQYSSHKDPSLSAKEKKTDYHNEARGP WPGWVG * 
RTADGSCGRGPDGAHHPGPKSSSWRASRIiLPGLGGSHHLDAYVG 
RDLECGTP APLQLE I PPQPRGHPAP IPTGQAG PRDS G PGAS P * V 
ETR PLTDGRR * PGVRPVGWTPAHPAGTLRPRGAVEPSVSACGKW 
APSPTSQGCCBGRCDAVPKHRAWRTPLCSQ 


5738 


B 


4£0 


DTLS LNCTl»PETLPMTP S P* LS FL * FPGLARAKS I PTKT YSNE V 
VTLWYRPPDlLLGSTDYSTQIDMW*GOVEVW0GPCGKGGGI»VTT 
ATQPAAFLPTVPSLPRGVGCIFYEMATGRPLPPGSTVEEQLHFI 
FRI1SEEAWALCAVETHR 


5739 


1 


1222 


S FQRRG I RWNVHTLHPH PRAV WAGI GRGHGS * ALLGRARAPALC 
FPTLLEFLESLEPDLPALRAMGLHLWAAGPGTHPAGISDIiLAEV 
SAEVIX3PVPGYLSSPQSITDTCLYIFTSGTTGLPKAARISHLKI 
LQCQGFYQLCGVHQEDVIYLALPLYHMSGSLLGIVGCMGIGATV 
VLKSKFSAGQFWEDOQQHRVTVFQYIGELCRYLVNQPPSKAERG 
HK VRLAVGSGLRP DTWER F VRR FGPIiQ VLETYGLTEGNVATINY 
TGQRGAVGRAS WL YKH I FP FSL I RYDVTTGE P I RDFQGHCMATS 
PGEPGLLVAPVSGQSPFLGYAGGPELAQGKLLKDVFRPGDVFFN 
TRDLLVCDDQGFLRFHDRTGDP FRWKGENVATTEVAEV FEALDF 
LQEVNVYGVTV 


5740 


265 




PAYWLKVPTLCLESKTDLREKASHVSAQLQGEVRGLAGALWM*A 
YVYERV^*NISRMVHALEQKRHPAGLSSSMALQLNPCIiGMLMA 
LQS ELHKLYDEETQSWVS G SACGG YP 


5741 


1 


650 


PRKTMRRGVLMTLLQQSAMTLPLW2GKPGDRPPPLOGAI PASGD " 
YVAR PGDKVAARVKAVDGDEQW IZiAEWSYSHATNKYBVDDIDE 
EGKERHTLS RRR VI PLPQWKANPETDPEALFQKEQLVLALYPQT 
TCFYRAL IHAPPQ RPQDD YS VLFEDTS YADGYSP PLNVAQRYW 
ACKEPKKK*CRLADSPSPNDTGQDSRGRAGIKHIPPLKKK 


5742 


2 1 


362 



TQSVKEILKRNPKVNLTDKIX3NTALMIASKEGHTEIVQDLLDAG 
TYVNI PDRSGDTVLIGAVRGGHVEIVRALLQKYADIDIRGQDNK 
TAli YWAVEKGNATMVRDI liQCNPDTEICTKDG 


5743 
5744 


2 


415 


GKTPEGIDAIEEIEIDLEETEREISPQMNGtBEVKPLGBMQTDL 
KATGREISPREKTPEVIDATEEXDKDLEETGRREISPEENGPBE 
VKPVDEMETDLKTTGREGSSREKTREVIDAABV1 ETDLEETE^E 
ISPQE 




3 


703 


TRRTTTTS PTTTRQMTTTPAALPTTVVTTPDLTTGTPIiQMrTI A ~ 
VFTTANTCLSLTPSTLPEEATGI/LTPEPSKEGPILTAESETVLP 
SDSWSSAESTSADTVUiTSKESKVWDLPSTSHVSMWKTSDSVSS 
PQPGAS DTAVPEQN KTTKTGQMDG I PMS MKNEMP ISQLLM I XAJP 
S IX3FVLFALFVAFLLRGKLMET YCSQKH*nUiDY I GDS KNVLND V 
QHGREDEDGLFTL 


5745 


1400 


599 


GKSRFVNLMKHSKKTYDS FQDELEDYI KVQKARGLEPKTCFRKM ' 

KGDYLETCX3YKGEVNSRPTYRMFDQRLPSETIQTYPRSCNIPQT 

VENRLPQWLPAHDSRLRIiDSLSYCQFTRDCFSEKPVPLNFNQQE 

YICGSHGVEHRVYKHFSSDNSTSTHQASHKGIHQKRKRHPBEGR 

EKSEEERSKHKRKKSCEEIDLDKHKSIQRKKTEVEIETVHVSTE 

KLKNRKEKKSRDWSKKEERKRTKKKKEQGQERTEEEMLWDQSI 
LGF 


5746 


3 


821 


SFASGRLTPSSPAFDGELDLQRYSNGPAVSAWSLGMGAVSWSES " 
iCAbbRRFPCPVCGKRFRFNS I LALHLRTHQ P ERP RS PAARLLLE 
LEERALLRE ARLGRAR5 5GGMQAT PATEGLARPQAPS S S AFRCP 
YCKGKFRTSAERERHLHILHRPWKCGLCS FGSSQEEELLHHSLT 
AHGAPERP LAATSAAP P PQPQPQP PPQPEPRS VPQPEPEPQ PER 
EATPTPAPAAPEEP PAP PEFRCQVCGQS FTQS WFLKGHMRKH KA 
SFDHACPV 


5747 


2 


1328 



DRHVETLCIHFLGPSTGSTAKTGGRNWLKTGNCLYGNTCRFVHG ' 
PSPRGKGYSSNYRRSPERPTGDLRERIKNKRQDVDTEPQKRNTE 
BSSS PVR KES S RGRHREKED I KITKERTP ESEEENVEWETNRDD 
SDNGDINyDyVHELSLEMKRQKIQRELMKLBQENMEKREEIIIK ' 

KEVSPBWRSKLSPSPSLRKSSKSPKRKSSPKSSSASKKDRKTS 

AVSSPLIiDQQftNSICmiQSKKKGPRTPSPPPPIPEDIALGKKYKE 

KyKVKDRIEEKTRDGKDRGRDFERQREKRDKPRSTSPAGQHHSP 

ISSRHHBSSSQSGSSIQRHSPSPRRKRTPSPSYQRTLTPPLRRS 

ASPYPSHSIiSS PQRKQS P PRHRS PMREKGRHDHERTSQSHDRRH 

ERREDTRGKRDREKDSREEREYEQDQSSSRDHRDDREPRDGRDR 

RE 


5748 


934 


473 


SEGPQVFYKGLAPTLI AI FP YAGLQFSCYSS LKHLYKWAI PAEG 
KKNENLQNLLCGSGAGVISKTLTYPLDLFKKRLQVGGPEHARAA 
FGQVRRYKGLMDCAKQVLQKEGALGFFKGLSPSLLKAALSTGFM 
FFSYEFFCNVFHCMNRTASQR 


5749 


552 


I 1 


GFPVDPRVRGSTIiSLAERPKGMI RSGS FRDPTDDVHGSVLSriAS' " 
SASSTYSSAEERMQSEQIRKLRRBLESSQEKVATLTSQLSAN3\N 
LVAAFEQSLVNMTSRLRHLAETAEEKDTELLDLRETIDFLKKKN 
SEAQAVICJGAtiNASETTPKELRIKRQNSSDS IS SLNSITSHSS I 
GSSKDADA 


5750 


22 


86* 


IF IS I CLWNAHLCFLLIiP kdcidq vmklqnlfvdds gr YLA I q F 
IILEWAYVFIiYYYE YRKAKDQLDIAKDISQLQI DLTGALGKRTRF 
QENYVAQLILDVRREGDVTUSNCEFTPAPTPOBHLTKNLEIiNDDT 
ILNDI KLADCEQFQMPDLCAEEXAI I LGICTNFQKNNPVHTLTE 
VELLAFTSCLLSQPKFWAIQTSALILRTKLEKGSTRRVBRAMRQ 
TQALADQPEDKTTSVLERDCI FYCCQVPPHWAIQRQLASLLFEL 
GCTSS ALQI FE KLEMWE 


5751 


3 


751 


scgsalrawrcgaaalat fp apalpglmyralyafrs ae pnaia 
faagetfiivlerssahwwlaarars getgyvfpaylrrdqgleq 
dvi^aidraieavhntamrdggkysleqrgvlqkl i hhrketi^s 
rrgpsassvavmtsstsdhhldaaaarqpngvcragferqhslp 
ssehlgadgglfqiplpssqippqprraapttppppvkrrdrea 
lmasgsgghntmpsggns vssgss VSS CI 


5752 


3 


471 



GPVCGVGLS VAWAG PWRGP VHS\/GG'GGRAALHGAE L PCLSGAAT " 

veremelrhknemlrvetbararakaerenadi ireqirlkase 

HRQTVIiES IRTAGTLFGEGFRAFVTDRDKVTATVNI FIKQGWQV 
AERQHVGASWS prscpcrlctal 


5753 


34 


483 


DDSXAIPGGVQAPFGAVRNIYTPRTGHRIRKLDQIQSGGNYVAG"" 

gqeafkklnyldigeikkrpmevvntbvkpvthsrinvsarfrk 
plqepcti fl iangdlinpas rll i prktlnqwdhvlqmvteki 

TuRSGAVHRLYTtiEGRLV 


5754 


14 


331 


TLVHVVEFAGEJIAEAXASREQEVI«0/3WKE^ 

ADALRFHSQVRDLLSWMDGIASQIGAADKPRCPSSLIjGIiPASPW 

WPTPATPSPLTAPFSME 


5755 


3 


888 


LGDQFY keaiehcrs ynsrlcaersvrlpfldsqtgvaqnnc Y I" 

WMEKRHRGPGLAPGQIiYTYPARCWRKKRRLHPPEDPKLRLLEIK 
PBVELPLKKDGFTSESTTLEALliRGEGVEKKVDAREEESIQEIQ 
RVLENDENVEEGNEEEDLEEDI PKRKNRTRGRARGSAGGRRRHD 
AASQEDHDKPWCDICGKRYKNRPGLSYHYAHTHIiASEEGDEAQ 
DQETRSPPNHRNEiraRPQKGPDGTVIPNNYaJFCI^GSKMNK^ 
GRPEEL VS CADCGRS AHLGGEGRKE KEAAA 


5756 


3 


621 


SSKIiC3AI*FAHPLYrn/PEEPPLIjGAEDSLLASQEALRYYRRKVAR 
WNRRHKMYREQMNLTSLDPPLQLRIjEASWVQFHLG inrhglysr 
SSPWSKLLQDMRHFPTISADYSQDEKALLGACDCTQIVKPSGV 
HLKLVLRFSDFGKAMFKPMRQQRDEETPVDFFYFIDFQRHNAE I 
AAFHLDR I LDFRR VPPTVGR2 VNVTKBI L 


5757 


3 


473 


YKDALLLPDNHRQWFENGTIiKLTDVQKGMDEGEYIiCSVLIQPQ" 
LS ISQSVHVAVKVPPLIQPFEFPPASIGQLLYI PCWSSGDMPI 
RITWRKDGQVI ISGSGVTIESKEFMSSLQ IS S VSLKHNGNYTCI 
ASNAAATVSRERQLIVRVPPRFW 


5758 


1 


474 


FRRGAGAERGEHREGERGAAGMGEFKVHRVRFFNYVPSGIRCVA 
YNNQSNRIAVSRTIX3TVEIYNLSANYFQEKFFPGHESRATEALC 
WAEGQRLFSAGLNGE IMEYDLQALNIKYAMDAFGGP I WSMAAS P 


5759 


1 2 


1240 


SGSQLLVGCEDGSVKLFQITPDKI PV 

GNAAFAGQGV V YETFHMS DLP S YTTNGT VHVWNNQ IG FTTD PR 
MARS S P Y PTD VAR WNAP I FHVNADD P BAVI YVGS VAAE WRNT F 
NKD VGADLVCYRRRGHNEMDE PM FTQ PLMY KQ I HR Q VP VL KKYA 
DKL I AEGTVTLQEFEEE I AKYDR I CEEAYGRS KDKK I LH I KHWL 
DSPWPGFFNVDGEPKSMTCPATGIPEDMLTHIGSVASSVPLEDF 
KI HTG^RI LRGRADMTKNRTVD WALAE YMAFGSLLKEG I HVRL 
NGQDVERGTFSHR>^VLHDQEVDRRTCVPMNHLWPDQAPYTVCN 
SSLSEYGVLGFELGYAMASPNALVLWEAQFGDFHNTAQCIIDQF 
ISTGQAKWVRHNGIVLLLPHGMEGMGPEHSSARPERFLQM3NDD 
3 D AYPAFTKDF3VSQL 


5760 


1 


1221 


VRDITSDSliSLSWTVPEGQFDKFLVQFKNGDGQPKAVRVPGHED - 
GVTISGLEPDBaCYKMNLYGFHGGQRVGPVSAVGLTAPGKDEEWA 
PASTEPPTPEPPlKPRIiEELTVTDATPDSIiSLSWTVPEGQFDHF 
I*VQ YKNGDGQPKATRVPGHEDRVT I SGLE PDNKYKMNLYGFHGG 
QRVGPVSAIGVTAAEEETPTPTEPSMEAPBP PEEPLLGE LTVTG 
SSPDSLSLSWTVPQGRFDSFTVQYKDRDGRPQVVRVGGEESEVT 
VGGLEPGRKYKMHLYG LH EGRRVGPVSTVG VTAPQED VDETPS P 
TEPGTEAPEPPEE PLLOELTVTGS SPDSLS LS WTVPQGRFDS FT 

VQYKDRlXmPC^VRVGGQESKVTVRGbEPGRKYKMHLYGLHBGR 
RLG PVS AIGVT 


5761 


3 


1275 


SCDMAEAAALVW IRSPGFGCKAVRCASGRCTVRDy ilHRHCQDQ^ 
VPVENFFVKCNGALINTSDTVQHGAVYSLEPRLCGGKGGFGSML 
RALGAQ XBKTTNREACRDLSGRRLRDVNHEKAMAEWVKQQAERE 
AEKEQKRLERIjQRKLVEPKHCPTSPDYO^QCHEMAERIjEDSVLK 
GMQAAS S KMVSAE I SENR KRQWPTKS QTDRGAS AGKRRCFWIjGM 
EGLETAEGSNSESSDDDSEEAPSTSGMGFHAPKIGSNGVBMAAK 
FPSGSQRARVVNTDHGSPEQIjQIPVTDSGRHILEDSCAELGESK 

ehmesrmvteteetqekkaes kepi bee ptgaglnkdketeert 
dgervaevapeerenvavaklqesqpgnavidketidllaftsv 

AELELLGLEKIjKCELMALGLKCGGTLQ 


5762 


2 


344 


GSTGOTPLHSQGGGGGSGGGRRRTPRGMPKEKYEPPDPRRMYTr 
MSSEEAANGKKSHWAEI^ISGKVRSLSASLWSLTHLTALHLSDN 
S LSR I P SDI AKLHNLVYLDLS SNKIR 


5763 


3 


429 


I^KDTGliIMLIARLDYELIQRFTLTIIARDGGGEETTGRVRINV ~ 
LDVNDNVPTFQKDAYVGALRENE PSVTQL VRLRATDEDS ppnwq 

ITYSIVSASAFGSYFDISLYEGYGVISVSRPLDYEQISNGLIYL 
TVMAMDAGN 


5764 
5765 


19 


441 


VCARACGEMRQLLRP IDRQRYDENEDLSDVEE I VSVRG FSLEE K ' 
LRSQLYQGDFVHAMEGKDFNYEYVQREALRVPLI FREKDGLG IK 

MPDPDPTVRDVXLLVGSRRLVDVMDVNTQKGT5MSMSQFVRYYE 
TPEAQRDKL 




3 


825 


QK I LRLNNSHQP PT5 SSNS KDCGG PASSGAGATAA LADGL KFAS 
VQAS APQGNS HKETS KS KVKRS KTS KDANKS LPS AAL YGI PEIS 
S7GKRQEVQGRPGEATGMNSALGQSVSSGGSGNPNSNSTSTSTS 
AATAGAGSCGKSKEEKPGKSQSSRGAKRDKDAGKSRKDKHDLLQ 
GHQNGSGSQAPSGGHLYGFGAXSNGGGAS PFHCGGTGSGS VAAA 
GEVS KSAPDSGLMGNSMLVKKEEEEEESHRRIKKIiKTEKVDPLF 
TVPAPPPHV 


5766 


1608 


663 


SGI^SVDPASSQAMBLSDVTLlEGVGNEVNWAGVVVLILALVL 
AWLSTYVADSGSNQLLGAIVSAGDTSVLHLGHVDHOVAGQGNPE 
e i ^Jj^nt'aiivj«iJttiw^i^u>EbKC»DSTGEAGAGGGVEPSLEHLLiD 
IQGLPKRQAGAGSSSPEAPLRSEDSTCLPPSPGLITVRLKFLND 
TB ELAVARP EDTVGALKS KYFP G QES QMKL I YQG RLLQD PARTL 
RSIiNITDNC^IHOIRSPPGSAVPGPSASLAPSATEPPSIiGVNVG 
SLMVPVFWLLGVVWYFRDTYRQFFTAPATVSLVGVTVFFS FhV 
FGMYGR 


5767 


2 


B92 


NFRATPRP PTRPELRTGTE VI LWYLDWRALMKRKRMKANI KLVG 
SGFPLPSSDLDDSLTEE IDEKIGFRNDANFDWQNVADFRDAGGS 
CTEVKVEEBERDPQSPEFEIEEEBEMLSSVIPDSRRENELPDFP 
HIDE FFTLNS TPSRSAYDE PHLLVNIEKQKLELiEKRRLDI EAER 
LQVEKERLQIEKEPXRHI^MEHERLQLEKBRIiQIEREKLRLQIV 
NSEKPSLENELGQGEKSMLQPQDI ETEKLKLERERLQLEKDRLQ 
FLKF3SBKLQ I EKERLQVEKDRLRIQKEGHLQ 


5768 


3 


476 


AAMVAKDYPFYLTVKRANCS LELP PASG PAKDAEEPSNKRVKPL 
SRVTSIiANLIPPVKATPLKRFSQTLQRSISFRSESRPDILAPRP 
WSRNAAPSS TKRRDSKLWSETFDVC 


5769 


38 




* j\a A.ivjv2vc.iv/vjL LAja ViVftjrAiSHUfKljQYVGFMQCSVTSKGVIHL 
TKLRNLSSIJDLRHITELDNETAMEIV^ 

DRCVEVIAKEGQNLJCEIiYliVSCKITDYALIAIGRYSMTIETVDV 
GWCKEITDQGATLj^QSSKSLRYLGLMRCDKVNEVTVEQ1,VQQY 
PHI TFSTVLQDCKRTLERAYQMGWTPNMSAASS 


5770 


1 


484 


Ob RR 1 u v ATRKW S FLLEEHSKL I AKVRCLPQ VQ LDPL PTTLTLA 
FAS QLKKTS L SLT PDVPE ADLS E VDPKLVSNLMP FQRAGVNFA I 
AKGGRLLLADDMGiyjKTIQAICIAAFYRKEWPLLVWPSSVRFT 
WBQAFLRWLPSLSPDCINVWTGKDRLTA 


5771 


168 


741 


GLLPSACLRARSWREASEGPSSRACSNGSQm'KEACYSGTS^PS 
FHGSHCSGSDHS SI£LEQI£DYM\n%RSKLGPLE IQQPAMLLRE 
YRIiGLPIQDYCTGt^KLYGDRRKFLLLGMRPFIPDQDIGYFEGF 
LEGVG I REGG I LTDS PGR I KRSMS S TS ASAVRS YDGAAQRPEAQ 

ftPtlDT.T BrNTT»Ur\Tt? 
HrnKLiiAUXiHUlh 


5772 


148 


383 


EFNIiALVSPSHPQIKAEJSDQPLPGVLIiSl^GGLFRSNLLTQDNG 
ILTFSNIiVTCSAIYHLPVFPEREPGCSMRDLRVA 


5773 


2 


723 


PRVRSlviiNFCFMEMNTRljQVEHPVTEMITGTDLVEWQLRIAAJGE ' 
KIPLSQEEITLQGHAFEARIYAEDPSNNPMPVAGPLVHLSTPRA 
DPS TRIETGVRO^DEVS VHYDPM I AKLVVWAADRQAAIiTKLR YS 
IiRQYNI VGLHTNTDFLLNLSGHPE FEAGNVHTDFIPQHHKQLIiL 
SRKAAAKESLCOJUU^LILKEKAMTDTFTLQAKDQFSPFSSSSG 
RRLNISYTRNMTLKDGKNSK 


5774 


2 


592 


FVEEENIRWRCGGSELNFRRAVFSADSKYI FCVSGDFVKVYST 
VTEECVHILHGHRNLVTGIQLNPNNHLQLYSCS LDGTIKLWDYI 
DGILIKTFTVGCKLHALFTLAQAED3VFVTVNKEKPD1FQLVSV 
KLPKSS S Q R V EAKELS FVLDY INQS PKCIAFGNEGVYVAAVREF 
YLS VYFFKKETTSRVTLS SS 


5775 


3 


?Tn 


SSGCCJ3PAAPSSIiAEAATMPVSKCPKKSESLWKGWDRKAQRNGI# " 
RSOVYAVNGD YWGEWKDNVKHQKGTQVWKKKGAI YEGDWKFGK 
RIXTifGTLSLPDQQTGKCRRVYSGWWKGDKKSGYGIQFFGPKEYY 
EGDWCGSQRSGWGRMYYSNGDIYEGQWENDKPNGEGMLRLSQNP 
RP 


5776 


2 


484 


RLPQDCVCQNLS ESIX3TLCPS KGLLFVPPD IDRR.TVELRLGGNF 
IIHISRQDFAN^f^GLVDLTLSRNTISHIQPFSFJ^LESLR^IlHL 
DSNRLPSIX3EDTLRGLVNLQHLIVNNNQLGGIADEAFEDFLLTL 

CiUuULij iWiNJ-irlOr'AvUljKoUAWVQPSTS 


5777 


2 


949 


GQDPE PGQDL FQ PERE VDPS WGRGRBPRLGKLRFQN DHLS ViKQ 
VKKLEQALKDGSAGLDPQLPGTCYSPHCPPDKAEAGSTLPENLG 
GGSGSBVSQRVHPSDLEGRBPTPELVEDRKGSCRRPWDRSLBNV 
YRGSEGS PTKPFINPLPKPRRTFKHAGEGDKIX5KPG IG FRKEKR 
NLPPLPSLPPPPLPSSPPPSSVNRRLWTGRQKSSADHRKSYEFE 
DLWJSSSESSRVDWYAG/mxGLTRTLSEENVYEDIl^PPMKENP 
YEDI EUiGRCt/GKKCVLNFPAS PTSS I PDTLTKQS LS KPAFFRQ 
NSERRNV 


5778 


1 


1210 


QRRQSVS RLLLP V FLLEPPAE PGLEPP PEEEGGE PAG VAEE PGS' ~ 
GGPCWLQLE EVPGPGPLfGGGGPLRS PS S YSS DELS PGBPLTS P P 
WAPLGAPERPEKZJiNR VLERLAGG ATRDSAAS DILLDDI VLTHS 
LFLPTEKFliQELHQYFVRAGG^GPEGLGRKQACIJ^LLHFLDT 
YQGLLQBBBGAGH1 1KDLYLLIMKDBSLYQGLREDTLRLHQLVE 
TVELKIPEENQPPSKQVKPLFRHFRRIDSCLGTRVAFRGSDEIF 
CRVYMPDHS YVTIRSRIiSAS VQDI LGS VTEKLQYSEEPAGREDS 
LILVAVSSSGEKVLLQPTEDCVFTALGINSHLFACTRBSYEALV 
PLPfeEIQVSPGOTEIHRVB^EDVANHLTAFHWELgRCVHSLEFV 
DYVFHGE 


5779 


138 


1*71 


EAVQVLIKH3ADVNARDKNWQTPLHVAAANKAVKCABVIIPLLS 
S VNVS DRGG RTAIiHHAAIaNGHVEMVNLLLAKGANINATO KKD^ 
AliHWAAYMGHLDVVALLINHGAEVTCKDKKGYTPLHAAASNGQI 
NVVBWLLNLGVEXDEINVYGNTALHIACYNGQDAVVNELIDYGA 
NVNQPNNNGPTPLHPAAASTHGALCLBLLVNNGADVNIQSKIX3K 
SPLH^AVHGRFTRSQTLIQNGGEIDCVDKDGNTPLHVAARYGH 
RI*L2NTLITSGA0TAKCGIfISMFPLHIAALNAHSDCCRKLIiSSG 
QKYS I V3LFSNEHVLSAGFEI DTPDKFGRTCliHAAAAGGNVECI 
KLIiQS S GADFHKKDKCGRTPLHYAAANCHFHC Z ETLVTTGAN VN 
ETDDWGRTALHYAAASDMDRNKTI LGNAH DNSEELERARELKEK 
EATLCLSFLLQNDANP S I RDKEG YNS IHYAAAYGHRQCLELLLE 
RTNSG F2ESDSGATKS PLHIAVS EMP 


5780 


154 




QFFRVITCLPFKGPDYRLYKSEPELTTVAEVDBSNGEEKSEPVS " 
EIBTSVVKGSHFPVGVVPPIU^PTPESSTIASYVTLRK?KKMM 
DLRTBRPRSAVEQLCLAESTRJRMTVEEQMERIRRHQQACIiREK 
KK<3LNVIGASDQS PLQS PSNLRDNP 


5781 




941 


rgslgghpwrp^mraascksci^vsfvtgphqbraygg^gpggaF" 

PA PPVSGTCP PDL I YAPTPEKAEGGSQ KNHQPPPGERAAKRDGE 

QAPCRAGPTRKVAVAPRPPSCP*GPE\PGEEPRRPLDRSPPLGQ 

VgPHFTSQDAKSAEDEAPSRHLGKHQPRSAQVGSRLnALQGPKT 

QHSIHTVTCKSPRQKEDRSPKPPQAPKHPEBHGRQS\QAPPPLP 

VAPSRTCGGC*TWDPALLVSP/PQGDSTPE1,PAP\QQPTGGPSR 

CRQALPPQG*ROX)PRQRPR/ P^TOASRSHPAKAKGCQGPPKIRNY 
NIMD 


5782 


5176 


1237 



DRSMMSMAAOSYTDSYTOTYTEAYNiVpbtPP^EPPTMPPtPTEX - 

PPMTPPLPPEEPPEGPALPTEQSALTAENTWPTEVPSLPSBESV 

SQPEPPVSQSEISEPSAVPTDYSVSASDPSVLVSEAAVTVPEPP 

PEPESS 1 7LTPVESAWAE EHEWPERP VTCMVS ETPAMSAE PT 

VLASEPPVMSETAETFI)SMRASGHVASEV3TSLLVPAVTTPVliA 

ESILEPPAMAAPESSAMAVLESSAVTVLES5TVTVLESSTVTVL 

EPS WTVPEPPWAEPDYVTI PVPWSALBPSVPVLE PAVSVLQ ' 

PSMIVSEPSVSVQBSTVTVSEPAVTVSEQTOVIPTBVAIESTPM 

ILESSIMSSKVMKGINI^SGIX2NIiAPEIGMQEIALHSGEEPHAE 

EHLKGDFYESEHGINIDLNINNHUIAJCEMEHNTVCAAGrSPVGE 

IGEEKILPTSETKQRTVLDTYPGVSEADAGBTLSSTGPFALEPD 

ATG\TS KGI 2FTTASTLSLVNKYDVDLS LTTQDTEHDMLI S TS P 

SGGSEADIEGPLPAKDIHI^LPSNOTIiVSSDTNEPLPVKRD\DQ 

TLAALI \SIiKESSGGEKEVPP PS * REHIiPDSGFSANIEDINEAD 

LVRPVSSPRTWNVLPSPRAGIi\EGP\LLASDFGPVQNLYSSPW 

\S8MP\ERASGS\SSGEKGG\YEIFVKVKDTHEKSKKNKNRDKG 

EKEKKRDSSLRSRSKRSKSSEHKSRKLTSESRSRARKRSSKSKS 

HRSVffl'RSRSRS/RDRRRRSSRSRSKSRGRRSVSKEICRKRSPKH 

RSKSRERKRKRSSSRDNRKTVRARSRTPSRRSRSHTPSRRRRSR 

S VGRRRS FS X S PSRRS RTPSRRSRTPSRRS RTPSRRSRTPSRRS 

RTPSRRSRTPSRRRRSRSWRRRSFSISPVRLRRSRTPLRRRFS 

RSPIRRKRSRSSERGRSPKRLTDLDKAQLLEIAKANAAAMCAKA 

GVPLP PNLKPAP PPTI EEKVAKKS GGATI EEIiTEKCKQ I AQS KE 

DDD VI VN KPHVS DEEEEEPPFYHHPFKLS EPKP I FFKLKIAAAK 

PTPPKSQVTLTKEFPVSSGSQHRKKEADSVYGEWVPVEKNGEEN 
KDDDNVFSSNLPSEPVDISTAMSERATjaniCRT.CPWBoriT t>&Mcu 
LNRAQE R I DAWAQLNS I PGQFTG S TGVQVLTQ BQ LANTG AQ A W I 
KKDQFLRAAP VTGGMGAVLMRKNGWREGEGLGKNKEGNKE PILV 
DFKTDRKGLVAVGERAQKRSGNFSAAMKDLSGKHPVSALMEICN 
KRRWQPPEFLLVHDSGPDHRKHFLFRVLINGSAYQPNCMFFLNR 
Y 


5783 


1693 


698 

1 


dsgLrvaftmeg isnfktps klsekxKs v~lcstptini pas pfm 

QKLGFGTGWNVYLMKRS PRGI^HS P WAVKKINP I CNDH YRS VYQ 
KRI^EAKILKSLHHPNTVGYRAFTEANIXSSLCIJWEYOGEKSL 
TOLIEE/PI*SQ/PKXLFX2QP/LILKVALNMARGLKYLHQEKKI J 






WO 01/53312 



PCT/US00/34263 



LHGDIKSStfWIKGDFETlKICDVGVStPLPSNMTVTDPEACYl' w ~ 
GTEPWKP KEAVEENGVI TD KADI FAFGLTLWEMMTLS I PHINLS 
NDDDDED KT FDES DFDDEAYYAALGTRPP INMEELDES YQKVIE 
LFSVCTNEDPKDRPSAAHIVEALETDV 


5784 


2?sg 


1388 


PR VRPR VRTDHNY YI SRI YG PS DSASR DL WVNIDQME KDKVK IH 
GILSNTHRQAAR VNLS FDF P FYGH FLRE I TVATGG FI YTGEWH 
RMLTATQ Y I APLMANFDPSVSRNS TVRYFDNGTALWQWDHVHI* 
QDNYNLGS FTFQATLLMDGRI I FGYKB I PVLVTQ ISS TNHPVKV 
G LS DAFVWHRI QQ I PNVRRR7 1 YEYHR VELQMSK I TN I S AVEM 
TPLPTCLQFNRCGPCVSSQIGFNCSWCSKLQRCSSGFDRHRQDW 
VDSGCFBESKEKMCENTEPVET\FLEPPQP*2RQPPSSGS*LPP 
E/DAVTSQFPTSLPTEDDTKIALHLKDNGASTDDSAAEKKGGTL 
HAGLI VGI LILVL I VATAIL VTVYMYHH PTSAASI FFI ERRPSR 
WPAM KFRRGSGH PAYAEVEPVGE KEGF IVSEQC 


5785 


oe o " 


1388 


PRVRPRVRrDHNYYISRIYGPSDSASRDLWVHIDQMEKDKVKIH" 
GI LSNTHRQAARVNLS FDFPFYGHFIjRBI TVATGGFI YTGEWH 
RMLTATQ YIAP LMANFD P 3 VSRNS TVR YFDNGTAL WQWDHVHL 
QDNYNLGS FTFQATLLMDGRI I FGYKE I PVLVTQISS TNHPVKV 
GLSDAFVWHRIQQ I PNVRRRTI YEYHRVELQMSKI TNI S AVEM 
TPLPTCLQFNRCGPCVSSQIGFNCSWCSKliQRCSSGFDRHRQDW 
VDSGCPEESKBKMCBNTEPVET\FLEPPQP+ERQPPSSGS*LPP 
E/DAVTSQFPTSLPTEDDTKIALHLKDNGASTDDSAAEKKGGTL 
KAG L I VG I LI LVT*IVATAILVTVYMYHHPTS AAS IFFI ERRPSR 
W PAMK FRRGSGHPAYAEVE P VGE KEG FIVSEQC 


5786 


2532 


1674 


S YKLPAAERRASSCSQ PPTPTRRRWPAPGRTSRGHRPQM * SGTP 
APRPPARSTVSPASPLPKPRAGRCGSRPRSACSTPRPC*SLN*M 
S * H * KRNLSQRS SSM SRRP LSCARPHR* * RQGLTVAARL PTW AK 
SPPLAC5FCQAAQKSQSLSSGRSTR*PERMSFRP\SPPGNPAIP 
SLAPSSRP/ PKGRPQCTWI PSRWPASPTAPPTTT* APTS S PGST 
GRSMMTCPTRWTATPWSARASSRPRNWPTP*VmPSGRLSTV*RA 
TGGSTATAPPKRFPRNWNPMMAE 


5787 


2 


1460 


MASAASVTSLADEVNCP \ ICQGTLKEAGSLSNCG/HKNFCRACIT" 
T\RYCEIP\GPD\LEESP\TCP\LCXEPFRP\GSFRPNWQLANV ' 
VENIERLQLVSTLGLGEEDVCQEHGEKIYFFCEDDEMQLCWCR 
EAGEHATHTMR FLEDAA\APYREQIHKCLKCI» I XBREB I QEIQS 
RENKRMQVLLTQVSTKRQQVTSEFAHLRKFLEEQQSILLAQL2S 
QDGDI LRQRDE FDLLVAGEI CRFSALIEELEEKNERPARELLTD 
IRSTL I RCETR KCRKPVAVS PELGQRIRDFPQQALPLQREMKMF 
LEKLCFELDYE PAHISLDPQTSHPKLLLSEDHQRAQF9 YKWQNS 
PDNPQRFDRATCVLAHTG I TGGRHT>7VVS IDLAHGGSCTVG VVS 
2 D VQ R KG ELRLRP E EGVWAVRLAWG FVS ALGS FP \ TRLTL KEQ P 

RQVRVSLDYBVGWVTFTNAVTREPIYTFTASFTRKVTPFFGLWG 
RGSSFSLSS 


5788 


2 


6860 


EHSVSGRSSAYGDATAkGHPAGPGSVsSST^ISTtTGHQEGDG " 

SEGEGEGETEGDVHTSNRLHMVRIjMI^ERLLQTLPQLRNVGGVR 

AIPYMQVr^LTTDLDGEDEKDKOALDNLLSQLIAEIiGMDKKDV 

SKKNBRSAI^EVHLVVMRLLSVFMSRTKSGSKSSICESSSLISS 

ATAAALLSSGAVDYCLHVLKSL.LEYWKSQQNDEEPVATSQLLKP 

HTTSSPPDMSPFFLRQYVKGHAADVFBAYTQLLTEMVLRLPYQI 

KKITDTNSRIPPPVFDHSWFYFLSEYLMIQQTPFVRRQVRKLLL 

FIOGSKEKYRQLRDLHTLDS \H VRGIKKLLEEQG I FLRASWTA 

SPOSALOYDTLISLMEHT^KAr^ARTaflnPTTWMrjvpCTVTjnox/T v 

FLLQVS FLVDEGVSPVLLQLLS CALCGSKVLRALAASSGSSSAS 

SSPAPVAASSGQATTQSKSSTKKSKKEEKEKEKDGETSGSQBDQ 

LCTALVNQIiNKFADKETLIQFLRCFLLESNSSSVRWQAHCLTLH 

IYRNSSKSQQEIJ*LDLMWSIWPELPAYGRKAAQFVDLLGYFSLK 

TPQTEKKLKEYSQKAVEILRTQNHILTNHPNSNI YNTLSGLVEF 

DGYYLESDPCLVCNNPEVPFCYI KLSS I KVDTRYTTTQ QWKL I 

GSHTISKVTVKIGDLKRTKMVRTINLYYNNRTVQAXVELKNKPA 

R WKKAKKVQLTPGQTE VK IDLPLP I VASNLMI EFADFYENYQAS 

TETLQCPRCS AS VPANPGVCGNCGENVYQCHKCRS IN YDEKDP F 
IiCNACGFCKYARFDFMLYAKPCCAVDPIENEEDRKKAVSNINTL " 
LDKADJR VYHQLMGHRPQLENLLCKVNEAAPEKPQDDSGTAGG IS 
STSASVNRYILQLAQEYOGDCKNSFDELSKIIQKVFASRKEI.LE 
YDLQQREAATKSSRTSVQPTPTASQYRALSVLGCGHTSSTKCYG 
CAS AVTEHC ITLLRAIATNPALRHILVS QGLI RELFDYNLRRGA 
AAMREE VRQLMCLLTRDNPEATQQMNDL I IGXVSTALKGHWANP 
DLASSI^YEMLU.TDSISKEDSCWELRLRCALSLFLMAVNIKTP 
VVVENITIiMCLRILQKLIKPPAPTSKKNKDVPVEALTTVKPYCN 
E IHAQAQIj WLKRD P KAS YDAWKKCLP IRG I DGNGKA PS KS ELRH 
LYLTBKYVWRWKQFLS RRGXRTS PLDLKLGHNNWLRQVLFTRAT 
QAARQAACTI VEALATI PSRKQQVLDLLTS YLDELS IAGECAAE 
YIiALYQKL ITS AHWKVYLAARGVLP YVGNL ITKE X ARLLALEEA 
TLSTDLQQGYALKSLTGLLSSFVEVES I KRHFKSRLVGTVLNGY 
LCLRKL WQRTKL I DETQDMLLE MLBDMTTGTESETKAFMAVCI 
BTAKRYNLDD YRTP VF I FERLCS 1 1 YPBBNEVTEFFVTLEKD PQ 
QEDFIiQGRMPGNPYSSNEPGIGPLMRDIKNKICQDCDLVALLED 
DSGMELLVNNKI ISLDLPVAEVYKKVWCTTWEGEPMRI VYRMRG 
IiIiGDATEEFIESLDSTTDEEEDEBEVYKMAGVMAQCGGLECMIiN 
RIAGI RDFKQGRHLLTVLLKLFS YCVKVKVNRQQLVKLEMNTIiN 
VMLGTLNLAL VAEQE S KDS GGAAVAEQVLS 1 ME I \ I QAE PNVEP 
LSEDKGNLLLTGDKDQLVMLIiDQlNSTFVRSNPSVLCGLLRIIP 
YLSFGEVEKMQILVERFKPYCNFDKYDEDHSGDDKVFL\DCFCK 
IAAG I K\NNSNGHQL\ KDL \ XLQKG I TQNALD \ YMKKHI P/ SAA 
RIWDADlXWKSFCLRPALPFIIiRLLRGLAIQHPGTQVLIGTDSI 
PNLHKLEQVS \ S DEG IGTLA\ ENL\ LESLREH PDVNKKI DA\ AR 
RETRAEKKRMAMAMRQKALGTLG \ MTTNEKGQWD/TRTALLE A 
DWEELI EEP \GLTCX:iCREGYKFQPTKVLGI YTFTKRWLGGVW 
ENKPRETSRATSTVSHFNI VHYDC \HLA\AVSLARGREEWESAA 
LQNANTKCNGLLPVWGPHVPESAPATCIARHNTYLQECTGQREP 
TYQLNIHDIKLLFLRFAMEQSFSADTGGGGRBSNIHLIPYI IHT 
GLYVLNTTRATSREEKNLQGFLEQPKEKWVESAFEVDGPYYFTV 
LALH I L PPEQW RATRVE I LRRLLVTS QARAVAPGGATRLTD KAV 
KDYSAYRSSLLFWALVDLIYNMFKKVPTSNTEGGWSCSLAEYIR r 
HNDMPIYEAADKALiCTFQHEFMPVETFSEFLDVAGLLSEITDPE 
SFLKDLLNSVP . . 


5789 


1 


2407


I^LHAVEKTGRPGQPALKMPGKLRSDAGbESDTAMKKGETLRKQ 
TBEKEKKE KPKSDKTEE IAEEE ETVFPKAKQVKKKAE PSEVDMN 
S PKS KKAKK\ KE B PS QKDI S PKTKSLRKKKEPI BKKVVS SKTKK 
VTKNEEPSEEEIDAPKPKKMKKEKEMNGETREKSPKLKNGFPHP 
EPDCNPSEAASEESNSEIEQEIPVEQKEG\AFSNFPISEETIKL 
LKGRGVTFLFP I QAKTFHHVYS G KDL I AQARTGTGKTFSFAI PL 
I BKLHG \ ELQDRKRGRAPQ VLVLAPTRE LANQVS KDFS DITKKL 
SVACFYGGTPYGGQFERMRNGIDILVGTPGRIKDHIQNGKLDLT 
iO^NPIVVLDEVDQMLDMGFADQVEEILSVAYFCKDSEDNPQTLLFS 
ATCPHWFNV7UCKYMKSTYEQVDLIGXKTQKTAITVEHLAIKCH 
WTQRAAVIGDVIRVYSGHQGRTIIFCETKKEAQELSQNSAIKQD 
AQSLKGD I P QKQREX TLKGFRNGS FGVLVATNVAARGLD I PE VD 
LVIQSSP PKDVES YI HRSGRTGRAGRTGVC I CFYQHKEE YQLVQ 
VEQKAGIKFKRIGVPSATEIIKASSKDAIRLLDSVPPTAISHFK 
QSAEKLIEEMSAVEAIAAALAHISGATSVDQRSLINSNVGFVTM 
ILQCS IEMPNIS YAWI03LKEQLGEEIDS KVKGMVFLKGKLGVCF 
DVPTASVTEIQEKWHDSRRWQLSVATEQPELEGPREGYGGFRGQ 
REGSRGFRGQRDGNRRFRGQREGSRGPRGQRSGGGNKSNRSQNK 
GQKRSFSKAFGQ 


5790 


3786 


1585 


ARRQRDP I^iAIjRJUUiQELKXWVDSLLS ES QLKEALEPNKRQH I Y 1 "' 
QRCIQLKQAIDENKNALQKLSKADE5APVANYNQRKEEEHTLLD 
KLTQQLQGLAVTISRENITEVGAPTEEEBESBSEDSEDSGGEEE 
DAEEEEEE KEENESHKWSTGEE Y I AVGDFTAQQ VGDLTFKKGE I 
LLVIEKKPDGWWIAKDAKGNBGLVPRTYLEPYSEEEEGQESSEE 
GS EE DVE AVDETADGAEVK\QRTDPH WSAVQ KAI SEAG I FCLVN 
HVSFCYLIVLMRNRMETVEDTNGSETGFRAWNVQSRGRI FLVSK 
PVLQQ INT VD VLTTMGAIPAGPR P <3Tt7<?Vt ,T.fe»Fn wn pd akW en" r! — 
PELMPSQLAKRDLM WDATEGTIRSRPS R I SLILTLWS C KM I PhP 
GMSIQVLSRHVRLCLFDGNKVLSNIHTVRATWQPKKPKTWTFS P 
QVTRILP CLLDGD CFI RSNS AS PDLG IL PBLGI S Y I RNSTGBRG 
BLS CG WVFL KLFDAS GVP I PAKT YELFLNGGTP YE KG I E VDP S I 
SRRAHG3VFYQIMTMRRQPQLLVKLRSI.NRRSRNVLSLLPETLI 

gnmcsihllifyrqilgdvllkdrmslqstdlishpmuatfpm1, 
LEQPDVMDALRSSWAGQES\TLKRSEKR\PKSFLKVPRFLLVYH
XGCVLPLL/HTPTRLPPFRWAEEETETARWKVITDFLKQNQENQ
GALQALLSPDGVHEPFDLSEQTYDFIIGEMRKKAV


5791 


3 


1636 


LRVAEPAGTSR/ IGAGLIQPLHRAPARDHGIijRGGAAPAIjSVSH 
GN/GKQL/AMSSQGSDDEQIKRENIRSLTMSGHVGFESLPDQLV 
NRS IQC^FCFNILCVGETGIGKSTLIDTIiFNTNPED YESSHFCP 
a v iujKAUU i eiiQESNVQLKLTI VNTVGFGDQINKBESYQPIVDY 
I DAQF EAYLQEE LKI KRSLFTYHDSRIHVCLYFIS PTGHSLKTL 
DLLTM KNLDS KVYI I P VI AKADTVS KTEI»QKFKIKI*MS ELVSNG 
VQI YQFPTDDDTIAKVNAAMNGQLP FAVVGSMDBVKVGNKMVKA 
RQYPWGWQVENENHCD FVKLREML I CTNMED LREQTHTRH YEL 
YRR CKLEEMG FTDVG PENKP VS VQET YEAKRHEFHG BRQRKEE B 
MKQMFVQRVKE KEAI L KEAERELQAXFEHLKRLHQEERMXIjEEK 
RRLLEEEI IAFSKKKATSEI FHSQSFLATGSNLRKDKDRKNSQF 
PVKQKVPEHRRSSSQANFIKKKLEVCFDFAVICFITSIFGEQPQ 
LLIFMEKYFQVQGQYISQSE 


5792 


2263 


653 


AAAAPSPAWWL^VFVVYVVHTCWVMYGIVYTRPCSGDASCIQPY^ 
IARRPKLQL\RHS FTTTRSHLGAENNI DLVLNVE DFDVESKFER 
TVNVSVPKKTRNNGTLYAYIFLHHAGVLPWHDGKQVHLVSPLTT 
xn v ^^iEIMIjIjTGBSDTQQIEADKKPTSALDEPVSHWRPRIjAIi 
NVMADNFVPDGSSLPADVHRYKKMI QU3KTVHYL PILF IDQLSN 
RVKDLMVINRSTTELPLTVSYDKVS LGRLRFWIHMQDAVYSLQQ 
FGFSEKIW)EVKGIFVDTI^YFIiALTFFVAAFHLLFDFLAFKND 
IS FWKKKKSMIGMSTKAVLWRCFS T WI PL FLLDEQTSLLVLVP 
AGVGAAIELWKVKKALKMTI FWRGLMPEFQFGTYS ESERKTEEY 
DTQAMKYLSYLL YPLCVGGAVYSLLNI KYKS WYSWLINSFVNGV * 
YAFGPLFMLPQLFVNYKLKSVAHLPWKAFTYKAFNTFIDDVFAF 
I ITMPTSHRLACFRDDVVFLVxX YQRWLYPVDKRRVNE FOES YE 
EKATRAPHTD 


5793 


2263 


653 


AAAAPSPAWWCGVFVVY^TCWVMYGIVYTRPCSGDASCIQPY 1 "" 
IARRPKLOL\RHSFTTTRSHLGAENNIDLVLNVEDFDVESKFER 
TVNVS VP KKTRNNGTLYAYI FLHHAG VLPWHDGKQVHLVS PLTT 

YMVP1TPRP T MTJ .TCI? QTlTW^ TOR n W nrpe ji t nDniinitt,mnnT •* - 

j n v rxurfiCi xiMxiii ibcsLi ryy I bJUJKK PTS AuDEP VS H WRPRLAL 
NVMADNFVFMSSLPADVHRYMKMIQLGKTVHYLPILFIDQLSN 
RVKDLMVINRSTTELPLTVS YDKVS LGRLRFW I HMQDAVYSLQ Q 
FGFS EKDADEVKG I FVDTNL YF LALTF FVAAFKLL FD FLAFKN D 
ISFWKKKKSMIGMSTKAVLWRCFSTVVI FLFLLDEQTSLLVLVP 
AGVGAAI EIiWKVKKALKMTI FWRGliM PE FQFGT YSES ERKTEE Y 
DTQAMKYLS YIiLYPLCVGGAVYSLLNIKYKSWYS WLINSFVNG V 
YAPG PL FMLPQLFVNYKIJCS VAHLPWKAFT YKAFNTF IDDVFAF 
IITMPTSHRIACFRDD\n^VYLYQRWLYPVDKRRVNEFGESYE 
EKATRAPHTD 


5794 


1 


5016 


MGPRI^VVnjLI^PAALLLHEEHSRAAAKGGCAGSGCGKCDCHGV " 
KGQKGBRGLPGLOGyTGFPGMOGPEGPOGPPGQKGDTGEPGLPG 
TKGTRG P PGASG YPGNPGLPG I PGQDG P PG P PG I PGCNGTKGER 
GPLGPPGLPGFAGNPGPPGLPGMKGDPGEILGHVPGMLLKGERG 
FPGIPGTPGPPGLPGLQ3PVGPPGFTGPPGPPGPPGPPGEKGQM 
GLSFQGPKGDKGDQGVSGPPGVPGQAQVQEKGDFATKGEKGQKG 
EPGFXySMPGVGEKGEPGKPGPRGKPGKDGDKGBKGS PGFPGEPG 
YPGLIGRQGP\QGEKGBAGPPGPPGrVIGTGPLGEKGERGYPGT 
P GPRGEPG PKGF PGL PGQPG P PGLP VPGQ AGAPGFPGERGEKGD 
RGFPGTS LPGPSGRDGI*PGPPGS PGPPGQ PG YTNG I VECQ PGP P 
3DQGPPGIPGQPGFIGEIGEKGQKGESCt.ICDIDGYRGPPGPQG 
PPGEIGFPGQPGAKGDRGLPGRDGVAGVPGPQGTPGLIGQPGAK | 
GEPGEFYFDIJiLKGDKGDPGFPGQPGMPGRAGSPGRDGHPGLPG 
PKGSPGSVGLKGERGPPGGVGFPGSRGDTGPPGPPGYGPAGPIG 
DKGQAGFPGGPGSPGLPGPKGEPGKTVPLPGPPGAEGLPGSPGF 
PGPQGDRGFPGTPGR\PGL\PGEKGAVG\QPGIGFPGPPGPKGV 
DGLPGDMGPPGTPGRPGFNGLPGNPGVQGQKGEPGVGLPGUCGL 
PGLPGIPGTPGEKGSIGVPGVPGEHGAIGPPGIiQGIRGEPGPPG 
LPGSVGSPGVPGIGPPGARGPPGGQGPPGLSGPPGIKGEKGFPG 
FPGLDMPGPKGDKGAQGLPGITCQSGLPGLPGQQGAPGIPGFK3 
SKGBMGVMGTPGQPGS PGPWGAPGLPGEKGD\HGFPGS SGPRGD 
P3LKGDKGDVGLPGKPGSMDKVYKGSMKGQKGDQGEKGQIGPIG 
EKGSRGDPGTPGVPGKDGQAGQPGQPGPKGDPGISGTPGAPGLP 
GPKGSVGGMGLPGTPGEKG VPG I PG PQGS PGLPGDKGAKG E KGQ 
AGPPG IGIPGLRGEKGDQG I AGFPGS PGEKGBKGS IGI PGMPGS 
PGLKGSPGSVGYPGSPGLPGEKGDKGLPGLDGI PGVKGEAGLPG 
TPGPTGPAGQKGEPGSDGIPGSAGEKGEPGLPGRGFPGFPGAKG 
DKGS KGEVG FPGLAGSPG I PGSKGEQG FMGPPGPQGQPGLPGS P 
GHATBGPKGDRGPQGQPGLPGLPGPMGPPGLPGIDGVKGDKGNP 
\3 wrwu'to v t\*if luMjna * Qun PG1GGS PG ITGS KG DMG P PG VPGF 
QGPKGLPGLQGIKGDQQDQGVPGAKGLPGPPGPPGPYDIIKGEP 
GLPGPEGPPGLKGLQGLPGPKGQQGVTGLVGIPGPPGIPGFDGA 
PGQKGEMGPAGPTGPRGFPGPPGPDGLPGSMG P PGTPS VDHGFL 
VTRHSQTIDDPQCPSGTKILYHGYSLLYVQGNERAHGQDLGTAG 
SCLRKFSTMPFLFCN I NNVCNFASRND YS YWLS TPEPMPMS MAP 
l7r «™-i*vrT iaKi^v^iayvPAwvMAVHSQTIQIPPCPSGWSSLWI 
GYS FVMHTSAGAEGSGQALAS PGSCLE E FRSAP F IECHGRGTCN 
YYANAYS FWLATIERSEMFKKPTPSTLKAGELRTHVSRCQVCMR 
RT 


5795


1192 


61 



STRSPTVE YISAH PH IIiFMLLKGYEAPQ I AltRCG I MLRE CIRRE 
PLAKI ILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVT, 
VADFLEQNYDTI FEDYEKLLQS2NYVT KRQSLKLLGEL I LDRKN 
FAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVA5PH 
KTQPlVElIiLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQI 


5796


2 


1078 


GRVGWEIiWCMYISPPKDMWDAGDPSLPIRTPAMIGCSFVVNRkF 
FGEIGLIiDPGMDVYGGENlELGIKVWLCGGSMEVLPCSRVAHIE 
RKKKPYNSNIGFYTKRNAIiRVAEVWMDDYKSHVYIAWNLPIiEKP 
G I DI GDVSERRAIiRKSLKCKNPQ WYLDHVYPEMRR YNNTVAYGE 
LRNNKAKDVCLDQG PLENHTAILYPCHGWG PQLAR YTKEGFIiHL 

GALGTTTLLPDTRCLVDNS KSRL PQLLDCDKVKSSL YKRWNFIQ 

NGAIMNKGTGRnTjPVRMDrsT.Jir'TnT rr ncpip/^ABuniTmTart/^n 
y CM ^^xu^±UuLljRoG^GQHWriaN3XK.*R 

EGAGAI^PGPQDMAAPPNIWTSCPGGETARGRQVLDGPPRASPG 
QHRDPG 


5797 


2 


891 


PRVRQKTL VD VTLENSN I KDQI RNJjQQT YEAS MD KLRE KQRQLE " 

VAQVENOIlI*KMKVBSSOStAMfl'l7UMT3BTWrPVVT VCnVDPVT nnnnrt 

BCHSAEKEAIiLEETNSFIiKAIEBANKKMQAAEISIjEEKDQRIGEL 
DRLIERMBKERHQLQLQLLEHETEMSGELTDSDKERYQQLEEAS 
ASLRERIRKUXDMVHCQQKKVKQMVEEIESLK30a^QKQIjLILQ 
LLEKISFLEGENNELQSRLDYLTETQAKTEVETREIGVGCDLLP 
SQTGRTREIVMPSRNYTPYTRVLELTMKKTLT 


5798 


644 


115 


KIIiOSRWKSMSKQEKQP YYEEQARLS KIHLBKYPNYKYKPRPKR 
TCIVDGKKLRIG3YKQLMRSRRQBMRQFFTVGQQPQIPITTGTG 
VnryPGAITMATTTPSPQMTSDCSSTSASPEPSLPVIQSTYGMKr 
DGGS kAGNEM I NGE DEMEM YDDYE DDP KS DYS S ENEAPEAVS AN 


5799 


2679


1435 



LIiSTYIKFINLFPBTKATIQGVLRAGSQLRNADVELQQRAVEYIj " 

tls s vas tdvlatvleempp fperes s i laklkrxkgpgagsal 
ddgrrdpssndinggmb ptpstvstps ps adllglraapppaap 
pasagagnllvdvfdgpaaq pslgptpeeaflspgpedigpp i p 
eadbllwkfvcknng^feitqllqigvxsefrqnlgrmylfygn 
ktsvqfqnfs ptwhpgdlqtqlavqtkr vaaqvbggaqvqqvl 
jiieclrdfltppllsvrfryggapqaltlklpvtinkffqptem 
^dffqrtoqlslpqqeaqkifkanhpmdaevtkakllgfgsa 
LLDNVDPNPENFVGAGIIQTKAI/2VGCLLRLEPNAQAQMYRLTL 
RTS KEPVSRHLCELLAQQF 


5800


2679 


1435 


LLSTYIKFINLPPETKAriQGVIjRAGSQLRNADVELQQRAVEyL 

TLSSVASTDVLATVLE3MPPFPERESS ILAKLKRKKGPGAGSAL 

DDGRRDPSSNDINGGMEPTPSTVSTPS PSADLLGLRAAPPPAAP 

PASAGAGNLLVDVFDGPAAQPSLGPTPEEAFLSPGPEDIGPPIP 

EADELLNKFVCKNNGVLFE1K}LI£IGVKSEFRQOTX3^ 

KTS VQFQNPS PTWHPGDLQTQLAVQT KRVAAQVDGG AQVQQVL 

NIECLRDFLTPPLLSVRFRYGGAPQALTLKLPVTINKFFQPTEM 

AAQDFFQRWKQLSLPQQEAQKI FKANHPMDABVTKAKLLGFGSA 

LLDNVD PN P ENFVGAG I IQTKAliQVGCLLRLE PNAQAQ MYRLTL 

RTSKEPVSRHIjCELLAQQF 


5801 


3 


1413 


FPRLYHLlPDtifelTSIKINRVDPSESLSIRtvdGdETPLVHlTl" 

QHIYRDGVIARDGRLLPGDIILKVNGMDISNVPHNYAVRLLRQP 

CQVLWLTVMREQKFRSRNNGQAPDAYRPRDDSFHVILNKSSPEE 

QLG IKLVRKVDEPGVFI FNVLDGGVAYRHGQLEENDRVliAINGH 

DLRYG S P E S AAHL I QAS E RRVHLWS RQVRQRS PDIFQBAGWNS 

NGSWSPGPGERSNTPKPLHPTITCHEKWNIQKDPGBSLGMTVA 

GGASHREWDLPIYVISVEPGGVISRDGRIKI^DILIiNVDGVELT 

KVS RS EAVALLKRTSSS IVLKALEVKEYE PQEDCSS PAALDSNH 

NMAPPSDWSPSWVMWLELPRCLYWCKDIVLRRNTAGSLGFCIVG 

GYEEYNGNKPFFIKS I VEGTPAYNDGRI ROOD XLLAVNGRSTSG 

MIHACLARLLKELKGRITLTIVSWPGTFL 


5802


3 


290 


CFSL YQIMERI MDL PTLLRHAFREMFS VGGLFWMFR I RI I LCLM 
GAFFYLISPIiDFVPEALFGIIiGFIjDDFFVIFLLLIYISIMYREV 
ITQRLTR 


5803 


2234 


1299 


B AQFGTTAE I YA YREEQD FGIB IVKVKAIGRQRPK^LEiRTQSD 

G1QQAKVQILP2CVLPSTMSAVQLESLNKCQIFPSKPVSREDQC 
S YKWWQKYQKRKFHCANLTSWPRWIiYSLYDAETLMDRI KKQLRE 
WDENLKDDSLPSKPIDFSYRVAACLPIDDVLRIQLLKIGSAIQR 

LRCE UD imnkctslcckqcqete i ttkne IPS lslcg pmaayvn 

PHGYVHETIjTVYKACNLNLIGRPSTEHSWFPGYAWTVAQCKICA 
SHIGWKFTATKXDKS PQKFWGLTRSALLPTI PDTEDEIS PDKVI 
LCL • 


5804 


2 


1707 


EME^RQE^QRK^T^afeRKRR^kQDMLEKP^lQRSt^aUVEQIE- 

DINNTGTESASEEGDDSLLITWPVKSYKTSGKMKKNFEDLEKE 

REEKERIKYEEDKRIRYEEQRPSLKEAKCLSIiVMDDEIBSEAKK 

ESLS PGKLKLTFEELERQRQENRKKQAEEBARKRLEEEKRAFEE 

ARRQMVNEDEENQDTAKIFKGYRPGKLKLSFEEMERQRREDEKR 

KAEEEARRRIEEEKKAFAEARRNMVVDDDSPEMYKTISQEFLTP 

GKLEINFEELLKQKMEEEKRRTEEERKHKIiEMEKQBFEQLRQHM 

GEEEEENETFGIiSRE YEELIKLKRSGSI QAKKLKS KFEKIGQLS 

EKEIQKKIHEERARRRAIDLEIKEREAENFHEEDDVDVRPARKS 

EAPFTHKWJMKARFEQMAXAREEEEQRRIEEQKLLRMQFEQREr 

DAALQKKREEEEEEEGS IMNGSTAEDEEQTRSGAPWFKKPLKNT 

SWDSE PVRFTVKVTGE PKPE ITWWFEGEILQDGED YQ YI ERQB 

TYCLYLPETFPEDGGEYMCKAVNNKGSAASTCILTIESKN 


5805 


3 


776 


YISDTIiGQVYKSKIRWWIEENGGNGNISVDDLIALLDIiAEHASS 
AFKESQQQS BDRE YE VKERLYPKS KRR YDTYNIAG YQGE I E VGL 
YTIQILQLIPFFDNKNELSKRYMVNFVSGSSDIPGDPNNEYXIiA 

RKVAGYFKKYVDIFCLLEE3QNNTGLGSKFSBPLQVERCRRNLV 
ALKADKFSGLLEYLIKSQEDAISTMKCIVNEYTFIiLK 


5806 


1257 


877 


AVFTFHNHGRTANLYSLHSWLGITTVFLFACQRFLGFAVFLLPW 
ASMWLRSLLKP IHVFFGAAILSLS IAS VISGINEKLFFSLKNTT 
RPYHS L PS EAVFANS TGMLWAFGLLVL YI LLAS S WKRP 


5807 


2257


1302 


RFSFOCTFRRPMAVDIQPACLGLYCGKTLLFKNGSTEIYGECGVC 
PRGQRTNAQKYCQPCTESPELYDWLYLGFMAMLPLVLHWFFI EW 
YSGKKSSSALFQHITALFECSMAAI I TLLVSDPVG VLY IRS CRV 
LMI^DWYTWLYNPSPDYVTTVHCTHEAVYPLYTIVFIYYAFCIiV 
LMMLLRP Li»VKK IACG LGKSDRFKS I YAALYFFP I LTVLQAVGG 

GLL YYAF P YI ILVLS LVTLAVYMS ASB 1 ENCYDLL VR KKRLI VL 

FSHWLLHAYGI IS ISRVDKLEQ0LPLLALVPTPALFYLPTAKPT 
BPSRILSEGANGH 


5808


2 


433 


SLPDSGVVEYL^NGGVADNHKDFGELRYl^CliMNFSCNGKNGSS 
EGR ITHGFQliKSAYETJNLMPYTNYTFDPKGVIDYI FYSKTHMNV 

LGVLGPLDPQWLVENNITGCPHPHIPSDHFSIiLTQLELHPPLLP 
LVNGVHLPNRR


5809


464 


2422


ILVPGPQGILHPGVYCAIiQSQHQAQELVADIDECEVSGLCRKGG 
RCVNTHGSFECYCMDGYLPRNGPEPFHPTTDATSCTEIDCGTPP 
EVPDGYI IGNYTSSLGSQVR YACREGFFS VPEDTVSS CTGLGTW 
ESPKLHCQEINCGNPPEMRHAILVGNHSSRLGGVARYVCQEGPE 
SPGGKITSVCTEKGTWRESTLTCTEILTKINDVSIjFNDTCVRWQ 
INSRRINPKIS YVIS I KGQRLDPMESVRBETVNLTTDSRTPEVC 
LAL YPGTNYTVNIS TAP PRRSMPAVIG FQTAEVDLLEDDGS FN I 
SIFNETCLKLNRRSRKVGSBHNnfQFTVLGQRWYIiANFSHATSFN 
FTTREQVPWCLDLYPTTDYTVNVTLLRSPKRHSVQI TIATPPA 
VKQTISNISGFNETCLRWRSIKTAD>fEEMYLFHIWGQRWYQKEF 
AQEMrFNISSSSRDPEVCIiDLRPGTNYNVSLRALSSBLPWISl, 
TTQITEPPLPEVEFFTVHRGPLPRLRLRKAKEKNGPISSYQVLV 
LP LALQS TFS CDSEGAS S FFSNAS DADGYVAAELLAKDVPDDAM 
EIPIGDRLYYGEYYNAPLKRGSDYCIILRITSEWNKVRRHSCAV 
WAQVKDSS LMLLQMAG VGLGSLAWI ILTFLS FSAV 


5810


3 


1641 


KVFGTHKDHE VSTLDTAI S AVKVQLAEFLEWLQEKSLRIEAFVS 
B IES FFNT I E ENCS KNEKRLBEQNEEMMKKVLAQYDE KAQS FEE 
VKKKKMEFIJIEQMVHFI^SMDTAKDTLETIVREAEELDEAVFLT 
SFEEINERLLSAMESTASLEKMPAAFSLFEHYDDSSARSDQMLK 
QVAVPQP PRLE PQEPNSATSTTIAVYWSMNKEDVIDS PQVYCME 
E PQDDQEVNELVBEYRLTVKES YCI FEDLEPDRCYQVWVMAVNF 
TGCS LPS ERAI FRTAPSTP VI RAED CTVCWNTAT IRWR PTTPEA 
TETYTLEYCRQHS PEGEGLRS PSGI KGIiQLKVNLQPNDNYFFYV 
RAINAFGTS EQS EAAL I S TRGTRFLLLRETAHPALHI3 S SGTVI 
S FGERRRLTE I PS VLGEEL P S CGQHYWETT VTDCPAYRLG I CSS 
SAVQAGALGOGETSW YMHCSEPQRYTFFYSG I VS DVHVTERPAR 
VGILLD YNNQRLIPINAESEQLLFI IRHRFNEGVHPAFALEKPG 
KCTLHLGIB P PDS VRHK 


5811 


1*18 


851 


AAALADPLPEDKWSAEKRRPI^SIX3YElTPSLIiNPDPKS'HDVY~ 
W D I EGAVRRYVQ P FLNALGAAGN PSVDSQ I L YYAMLGVNPRFDS 
ASSSYYLDMHSLPHV1NPVESRLGSSAASLYPVLNFLLYVPELA 
HSPLYI QDKDGAPVATNAF1ISPRWGGIMVYNVDSKTYNASVLPV 
RVBVDMVRVMEVFIiAQLRLLFGIAQPQLPPKCIiLSGPTSEGLMT 
WELDRLLWARSVENLATATTTLTSLAQLLGKISNIVIKDDVASE 
VYKAVAAVQKSAEELAS GHLASAFVASQ EAVTSS ELAFFDPSLL 

HLLYPPDDQKPAIYIPLFLPMAVPILLSLVKIFLETRKSWRKPE 
KTD 


5812 


520* 


2*44 


GGRQRCQRGRSCGAREBEVEPGTARPPPAASAMDASLEKIADPT 
LAEMG IWLKEAVKMLEDSQRRraEENGKKLISGDI PGPLQGSGQ 
DMVS I LQL VQNLMHGDEDEE PQS PRI QNTGEQGHMALLGHSLGA 
YISTLDKEKLRKLTTRILSDTTLWLCRIFRYENGCAYFHBEERE 
GLAKI CRLAIHSRYEDFWDGFNVLYNKKPVIYLSAAARPGLGQ 
YLCNQLGLPFPCLCRVPCNTVFGSQHQMDVAFLEKLI KDD I ERG 
RLPLLL VANAGTAAVGHTDKI GRLKELCEQYG I WLHVEGVNLAT 
LAI^YVSSSVLAAAKCDSMTMTPGPWLGLPAVPAVTLYKHDDPA 
LTLVAGLTSNKPTDKLRALPLWI^LQYLGLDGPVERIKHACQLS 
Q RLQE S L KKVNY I KI LVEDELS S PVWFR FFQELPGS D P VP KAV 
P VPKMTPSGVGRERHS CDALNRWLGEQLKQLVPASGLTVMDLEA 
EGTCLRFSPLMTAAVLGTRGED VDQLVACI ES KLP VL CCTLOjLR 
EEFKQEVEATAGLLYVDDPNWSGIGWRYEHANDDKSSLKSYPQ 
GENIHAGIiLKKLNELESDLTFKIGPEYKSMKSCLYVGMASDNVH 
AAELVETIAATAREI EDNS RLLENMTEVVRKGIQEAQVEXQKAS 
EERLLEEG VLRQ I P WGS VLNW FSPVQALQKGRTFNLTAGSLES 
TBPIYVYKAQGAGVTLPPTPSGSRTKQRLPGQKPPKRSLRGgnA^ 

LSK'I'SSVSHIEDLEKVERLSSGPEQITLEASSTEGHPGAPSPQH 

TDQTEAFQKGVPHPEDDHSQVEGPESLR 


5813 


2936 


<*99 


HRDGVSGSLERPLTDRSRTGAFAO^RGKMATAGGGSGADPGSRG 
LLRLLSPCVLIAGLCRGNSVERKIYIPLNKTAPCVRLIiNATKOI 
GCQSSISGDTGVIHVVEKEEDLQWVLTDGPNPPYMVLLESKHFT 
RDUMEKLKGRTSRIAGLAVSLTKPS PASG FS PS VQCPNDGFGVY 
SNS YGPK FAH CRE IQWNS LGNGLAYE DPS FP I FLLEDENETKV I 
KQ CYQDHNLSQNGSAPT F PLCAMQLFSHMAWLS FSTAT \ CMRRS 
S I QS TFS INV K I VCDP LSD YNVW SML KP I NTTGTLKPDDRVWA 
ATRLDSRSFFWNV\APGAESAVASFVTQLAAAEAIiQKAPDVTTL 
r kjm vn a v L e t QGSTrDY iGS SRMVYDMEKGKFPVQLiENVDSFVBL 
GQVALRTSLELNMHTD P VS QKNES VRNQVEDLLATLE KSGAG VP 
AVILRRPNQSQPLPPSSLQRFLRARNISGWLADHSGAFHNKYY 
QS I YDTABJf INVS YPEV7LEPLKB/ETWNFG* QDTAKALADVATV 
LGRALYELAGGTN FSDTVQADPQT VTRLLYG \ FLI KANNS WFQS 
ILQGRDLRS YLG* RGL FQH\ YIAV\ S SPTNT X YV/ VLQ YALANL 
TGTWNLTREQCQDPSKVPSENKDLYEYSWVQGPLHSNETDRIiP 
RCVRSTARLARALSPAFELSQWSSTEYSTWTESRWKDIRARIFL 

IASKELELITLTVGFG ILI FSL I VTYCINAKADVLFIAPREPGA 
VSY 


5814


8500 


432 



ALKCRPRRVLAXLVGPVQPDRMAEEGAVAVC^VRPI^fSREESL 
GETAQVYWKTHNNVI Y P VDGS KS FNFDRVLHGNETPKNVYEA\ I 
AAPIIDSAIQGYNGTIFA\YGQT\ASGKTYTMMGSEDHLGVIPQ 
GQFHGHFSQKI * EVFLDREFLIiRVS YME I YNBT I TDLLCGTQ KM 
KPLI IRSDVNRNVYVADLTEEWYTS EMALKW I TKGEKSRH YGE 
TKMNQRSSRSHTIFRMILESREKGBPSNCEGSVKVSHIiNLVDLA 
GSERAAQTOAAGVRLKEGCNINRSLFILGQVIKKLSDGQVGGFI 
NYRDSKLTRILQNSLGGNPXTRI I CTITP VSFDETLTALQFAST 
AKYMKNTP YVNEVSTDEALLKRYRKE IMDLKKQLEEVSLETRAQ 
AMEKDQLAQLLE E KDLLQKVQNEK I ENLTRML VTS SSLTLQQ3L 
KAKRKRR\mJCIX3KINKMKNSNYADQFNIPTNITTKTHKLSINL 
LRE IDE SVCS E S D VFSNTLDTL S E I E WNP AT KLLNQEN IBS ELN 

SLRADYD^VLDYEQLRTEKEEMELKLKEKNDLDEFEALERKTK 
KDQEMQLIHEISNLKNLVKKREVYNQDLENELSSKVELLREKEX) 
QIKKLQEYIDSQKLENIKMDliSYSLESIEDPKQMKQTLFDAETV 
AIMKRESAFLRSENLELKEKMKEIAT^^ 
AKKKMQVDtiBKELQSAFNEITKLTSLI DGKVPKDLLCNLELEGK 
I TDLQKELNKEVEENEALRBE VI LLSBLKS LPSEVERLRKEIQD 
KSEELHI ITSEKDKLFSEWHKESRVQGLLEEIGKTKDDLATTQ 
SNYKSTDQEFQNFKTLHMDFEQKYKMVLEBNERMNQE I VNLSKB 
AQKFDSSI^ALKTELSYKTQBLQEKTREVQERLNEMEQLKEQLE 
NRDSPLQTVEREKTLI TEKLQQTLEaSVKTLTQEKDDLKQLQESL 
QIERDQLKSDIHDTVNMNIDTQEQLRNALESLKQHQETINTLKS 
KISEEVSRNLHMEEJNTGEITKDEFQQKMVGIDKKQDLEAKNTQTL 
TADVKDNEI XEQQRKI FSLIQEJQVELQQMLES VI AEKEQLKTDL 
KENIEMTIENQEELRLLGDBLKKQQEIVAQEKNHAIKKEGELSR 
TCDRLAEVEEKLKBICSQQLQEKQQQLLNVQEEMSEMQKKINEIE 
NI^NELKNKELTI^HMETERLEIAQKLNENYEEVKS ITKERKVL 
XELQKS FETERDHLRGYI RE IEATG LQTKEELKIAH I HLKEHQ E 
T I DELRR9 VS EKTAQ I INTQDLE KSHTKLQEE I P VLHE EQELLP 
NVKKVSETQETMNEIiEL LTEQS TTKDSTTLAR IEMERLRLNEKF 
QESQEE I KSLTKERI3NLKTIKEALE VKHDQIiKEHIRETLAKIQE 
SQSKQEQSLNMKEKDNETTKIVSEMEQFKPKDSALLRIEI EMLG 
LSKRLQE SHDEMKSVAKE KDDLQRLQEVLQS BSDQLKENI KE I V 
AKHLETE EELKVAHCCLKEQEE TTNELR VNLSEKETE 1ST I QKQ 
LEAINDKLQNXIQEIYEKEEQLNIKQISEVQEKVNELKQFKEKR 
KAKDSALQSIESKMLELTNRLQESQEEIQIMIKEKBEMKRVQEA 
LQIERDQLKENTKEIVAKMKESQEKEYQFLKMTAVNETQEKMCE 
I EHLKEQFETQKLNLEN I ETENI RLTQILHENLEEMRS VTKERD 
DLRSVEETO^RJXILKENLRETITRDL^KQEELKIVHT^^ 
QBTIDKLRGIVSBKTNEISNMQKDLBHSNDALKAdbLkiQEELR " 

IAHMHLKEQQBTIDKXRGIVSEKTDKLSNMQKDLENSNAKLQEK 

IQS LKANEHQL I TDGOVNBTQ KKVSEMEQLKKQ I KDQS LTI*S K 

LEIE1^NIJU3KLHEIIL3EMKSVMKERDNLRRVEBTLFCLERDQLK 

ESLQOTKARDLEIQQELKTARMLSKEKKETVDKLREKISEKTIQ 

ISDIQKDI^KSKDBLQKKIQEl^KKELQLLRVKEDVNMSHKK^ 

EMEQtiKKQFEPNYLCKCEMDNFQLTKKLHESIiEEIRIVAKERDE 

LRR I KE S LKMERDQ F I ATLRE M I ARDRQNHQ VKPE KRLLS DG QQ 

HLMESLRE KCSRI KELLKR YS EMDDHYE CLNRJLS LDLEKEI E FH 

RIMKKLKYVLSYVTKIKEEQHECINKFEMDFIDEVBKQKELLIK 

IQHIjQQDCDVPSRELRDLKLNQNMDLHIEEILKDFSESEFP3IK 

TBFQQVLSNRKEMTQFLEEWLNTRFDIEKLKNGIQKENDRICQV 

NNF FNNRI IAIMNESTE FEERSATI S KEWEQDLKSLKE KNEKLF 

KNYQTLKTSLASGAQVNPTTQDNKNPHVTSRATQLTTEKIRBLE 

NSLHEAKBSAMHKESKI I KMQKELEVTNDI I AKLQAKVHESNKC 

LEKTKETIQVLQDKVALGAKPYKEEI EDLKMKLGKIDLEKMKNA 

KEFEKEISATKATVBYQKEVIRLLRENLRRSQQAQDTSVISEHT 

DPQPSNKPLTCGGGSG I VQNTKAIi I LfCS EHIRLEKE I S KLKQQN 

EQLI KQKNELLSNNQHLSNEVKTWKERTLKREAHKQVTCENS PK 

SPKVTGTASKKXQITPSQCKEBNLQDPVPKESPKSCFFDSRSKS 

LPS PHPVR YFDNS S LG LCP EVQNAGAE S VDS Q P \GP WARL FQG K 

DVP\ECKTQ 


5815


23 


1460 


SELVMWTVQNRESLGLLSFPVMITMVCCAHSTNEPSNMSYVKET 
VDRLLKG YD IRLRPDFGG P PVDVGMR I OVAS IDMVSEVNMD YTL 
TMYFQQSWKDKRLSYSGIPIJ>TLTLDNRVADQLWVPDTYFLNDKK 
S FVHGVTVKNRMI RLH PDGTVLYGLRI TTTAACMMDLRRYPLDE 
QNC7LEIES YG YTTDD I E F YWNGG EG AVTG VMKI ELPQ FS I VDY 
K14VSKK^FTTGAYPRLSLSFRLKRN1GYFILQTYMPSTLITIL 
SWV S FWINYDASAARVALG I TTVLTMTT ISTHLRETLPKI P YVK 
AIDI YLMGCFVFVFLALLE YAFVNYI F FGKGPQKKGASKQDQSA 
NEKNKLEMNKVQ VDAHGNI LLSTLE I RN BTSGSE VLTS VSDP KA 
TMYSYDSASIQYRKPLSSRB\A*GRAPDRHGVPSKGRIRRRAS\ 
QLKVKI PDLTDVNS IDKWSRM FFP I TFSLFNWYWLYYVH 


5816 


861 


151


TSSRSRAAAQEGDAETPGSVERRGRRAGAEDGMSQAPGAQPSPP 
TVYHERQRLBLCAVHALNNVLQQQLFSQEAADE I CKRLAPDSRL 
NPHRSLLGTGNYDVNVI MAAIiQGLGLAAVWWDRRRPLSQLAliPQ 
VLGLILKLP3 P VS LGLI>SLPLRRRHLRWPCARL/VTVSYYNLDS 
K\IiRAPEGPGGLRTE\ *G P FLAAALAQGLCEVIiLVVTKEVEEKG 
SWLRTD 


5817 


851 


116 


RLFRGPGANRGRSCRGCSGGREPSGGALPKRHCPC*PPSPPAAD 
VMSNTTVPNAPQANSDSM VpY VLGPFFL I TLVGVWAWMYVQK 
JCKRVDRLRHHLLPMYSYDPAEELHEAEQELLSDMGDPKW\QAG 
RVATSTSGCHCWMSRRDLTPLPHPSBPGVLDCLGPCHLLPIiLSP 
QSPCWVLGLHFSLHPPSAASASHALTITSLPPGLLPFVGVELTA 
HPQALMGRGFPSGMAAAGRHLCFI* 


5818 


3 


3918 


QALRDKLWIFLVaSFYAVRHTESWKtMS^DDQQklQAAAFDKGD -- 
DRRLGKKPIFSSSQQRKQVSDSGDIKIKSWRGNNKKECWSYLST 
NKKMKSDGLGASGHSSSTNRNSINKTLKQDDVKEKDGTKIASKI 
TXELKTGGKNVSGKPKTVTKS KTENGDKARLENMS PRQWERSA 
TAAAAATGQ KNLLNG KG VRNQEGQ I SGARP KVLTGNLNVQAKAX 
PIjKKATGKDSPCLSIAGPSSRSTDSSMEFSISTECLDEPKENGS 
TEEEKPSGHKLS FCDS PGQMMKNS VDS VKWSTVAIKSRP VSRVT 
NGTSNK K S I HEQDTNVNNSVLKKVSGKG CSE P VPQAI LKKRGTS 
NGCTAAQQRTKSTPSNLTKTQGSQGESPNSVKSSVSSRQSDENV 
AKLDHNTTTEKQAPKRXMVKQVHTALPKV1JAXIVAMPKNLNQSK 
KGETLNNimSKQKMPPGQVISKTQPSSQRPLKHETSTVQKSMFH 
DVRDNNNiCDS VSEQKPHKPL1NLASEIS DAEALQSSCRP \DPQK 
PLNDQEKEKbALECQNI S KLDKSLKHELE SKQI CLDKS ETKFPN 
HKETDDCDAANI CCHS VGSDNVNS KFYSTTAL KYMVSNPNENSI* 
NSNPVCDLDSTSAGQIHLISDRENQVGRKDTNKQSS I KCV3DVS 
LCNPEETNGTLNSAQEDKKSKVPVEGLTI PS KI*SDESAMDEDKH | 
atadsdvsskcfsgqlseknspknmetsespbshetpetpfvgh 
wnl5tgvlhqrespesdtgsattssdd1kprsedydagqsqddd 

GSNDRG I S KCGTMLCH D FIX3RSSSDTSTPEELKI Y DSNIjRI EVK 
MKKQS3NDLFQVNSTSDDEIPRKRPEIWSRSAIVHSRERENIPR 
GS VQFAQE I DQVSSSADETED ERSEAENVAENFS I SN P APQQFQ 
GI INLAFEDATENECREFSANECKFKRSVLLSVDECBEIjGSDEGE 
VHTPFQASVDSFSPSDVFDGISHEHHGRTCYSRFSRESEDNIIiE 

CKQNKBNSVCKNESTVLDLSS idssrknkqsvsatekkntidvl 
ssrsrqllredkkvnngsnvendiqqrskfldsdvks qerpchl 

DLHQREPNS DI PKNS ST KS LDSFRSQVLPQEGPVKESHS TTTE K 
ANIALSAGDIDDCDTLAQTRMYDHRPSKTLSPIYEMDVIEAFEQ 
KVES2THVTDMDF*DDQHFAXQDWTLLKQLLSEQDSNLDVTNSV 
PEDLSLAQYLINQTLLLARDSSKPQGITHI DTLNRWSELTSPLD 
SSAS ITMAS FSS EDCSPQGEMTILELETQH 


5819 


1 


5557 


AAAGLUaAtiHLVMTIiVVAAARAEKE AFVQS ES X X E VLRFDDGG L 
LQTETTLGLSSYQQKSISLYRGNCRPIRFEPPMLDFHEQPVGMP 
KMEKVYLHNPSSE+TITLVSIFATTSHFHASFFQNRKXLPGGNT 
SFDVS/VFLARVVGNVENTLFINTSNHGVFTY\QVFGVGVPNPY 
RLRPFLGARVTVNSSFS PI INIHNPHSEPLQWEMYSSGGDLHL 
ELPTGQQGGTRKLWEI PP YETKGVMRAS PS SREADNHTAFIRI K 
TNASDSTEFIILPVEVEVTTAPGIYSSTEMLDFGTLRTQDLPKV 
LNLHIiIiNSGTKDVPITSVRPTPQ \NDAXTVHFKP ITLKAS \ESK 
YTKVASISFDASKAKKPSQFSGKITVKAKEKSYSKQEIPYQAEV 
LDGYLGFDHAATLFHX RDS PADP VER P X YLTNTFS FAIIjIHD VL 

l pe eaktmfkvhn fs kpvl i lpne sg y i ftllfmpstss mhi dn 
ni llit^askfhlpvrvytgfldyfvlp pkieerfidfgvlsat 
easnilfai inskpielaikswhi igdg\ls3el*vavdrgnrtt 
x i s slpecekss s s dqs s vtlasgyf \avfrvkltakkl \ eg ih 
dgaiqittdyeilti pvk\ aviavgsltcspkhwlpps fpgki 
vhqslnimnsfsqkvkiqqirslsbdvrfyykrlrgijkedlepg 
kkskian i y fd pglqcgdhcyvgl p fls ksepkvqpgvamqedm 
wdad wdlhqslfkg wtg i ken5ghrls a i fevntdlqkn 1 1 s ki 
tablswpsilssprhlkfpltntncss \ eeeitlenp/sqdvpv 
yvq f i pial ysnp s vfvd kl vs r fniis kvakidzirtle fq vfrn 
sahplqsstgfmeg\lsphlilnlilkpgekksvkvk\ftpvhn 
rtvssli i vrnnltvmdavmv^ g^gttenlrvag klpgpgsslr 
fkiteallkdctdslklrepktftlkrtfkventgqiiqihietie 
isg ys cegygfkwncqeftls anasrd i x ilft pdftas rvir 
elkfittsgsefwiimslpyhmlatcaealprpnweiialyx i 
isgimsalfllvtgta\yleaqgiwbp\frrrls\feasnppfd 
vgrpfdlrri vg i s segnlntlscdpghsrgfogaggs s sr ps a 
gshkq*gpsghphsshsnrnsadvddvraynsgrtssmtsaoaa 

S SQPANKTRPLVLDSNTGAQGHSAGRKS KGAKQS QHGS QHHAHS 
PLEQHPQPPLPPPVPQPQEPQPERLSPAPLAHPSHPERASSARH 
S SEDSDITSLIEAMDKDFDHHDS PALEVFTEQPPSPLPKSKGKG 
KPLGRKVXPPKKQEEKEKKGKGKPQEDELKDSLADDDSSSTTTE 
TSNPDTEPLLKEDTEKQKGKQAMPEKKESEMSQVKQjECSKKLLNI 
KKBIPTDVKPSSLELP YTP PLE S KQRRNLPS KIPLPTAMTSOS K . 
SR^QKTKGTSKLVDNRPPALAKFLPKSQELGNTSSSEGEKDS P 
PPEWDSVPVHKPGSSTDSLYKIiSLQTLNADIFLKQRQTSPTPAS 
PSPPAAPCPFVARGSYSSIVNSSSSSDPKIKQPNGSKHiCLTKAA 
SLPGKNGNPTFAAVTAGYDKS PGGNGFAKVSSNKTGFSSSLGI S 
xiMr vuou\aou&siwJafla P vttNFSS PDrTPIiNSFSAFGNSFNIiTGE 
VFSKLGLSRSCNQASQRSWNEFNSGPSYLWESPATDPSPSWPAS 
SGSPTHTATSVLGNTSGLWSTTPFSSSIWSSNLSSALPFTTPAN 
TLAS XG IiMGTENS PAPHAPSTSS PADDLGQTYNP WRI WS PTIGR 
RSS DPWSNSHFPHEN 


5820 


310 


1270 


RVSLSGPVSLGVLLCARSSTMGKRDNRVAYMNPIAMARSRGPIQ 
SSGPTIQ\ VI * IDQGLPGKK* KSN* KRKRK/DSKALAEFEEKMN 
ENWKKKLEKHREKXiSGSESSSKKRQRKKKEKKKSW* \DSSSS\ 
SSSSDSSSSSSDSEDEDKKQGKRRK30CKNRSHKSSESSMSETBS 
Ui>iUJbijivivivf^iusraj<j i tiUiM^AKULSKKRKMySEDKPLSSESLS 
ESBYIEEVRAKKKKSSEEREKATEKTKKKKKHKKHSKKKKKKAA 
SS S PDS P *H * E KSGFP YKESAMSEE I STVKTTTYLLKCMNFL VP 
GI I PGLFSSHSDATV 


5821 


173 


915 


KWRNQS WRW PKPGTNWMLS CS VCWRR VTWTGfc VWMRKLGKHPQT 
PT/ 1 KDCS I AATG KRPSARF PHQRRKKRREMDDGLAEGGPQRSN 
TYVIKLFDRSVDLAQFSENTPLYPICRAWMRNSPSVRERBCSPS 
SPLPPLPEI)EEG\SEVTNSKSR*CVOACPPTHTPGGQPK»ACR\ 
SR I PS PLAALRMQGTP * RWS P FE PE PS PSTL I YRNMQRWKRI RQ 
RWKEASHRNQLRYSESMKILREMYERO 


5822 


464 


4379 


QTLKEM P I VMARDLEETAS S S EDHE VI SQEDHP C I MWTGGCRRl 
PVIAf FHADAI LTKDNNI R VIG ER YHLS YKI VRTDS RLVRS I LTA 
HG PHEVHPSSTD YNLMWTGS HLKP FLLRTLSEAQKVNH F PRS YE 
LTRKDRLYKNIIRMQHTHGFKAFHILPQTFLLPAEYAEFCNSYS 
KDRGPWIVKPVASSRGRG\VYIjINNPNQISIiEENILVSRYINNP 
LLIDDF3CFDVRLYVLVTSYDPLVIYLYEEGIiARFATVRYDQGAK 
NI RNQ FMHLTN YS VNKKS GD Y VS CDDP3VED YGNKWSMS AM LRY 
LKQBGRDTTALMAHVEDLI I KTI ISAEIiAJATACKTFVPHRSSC 
FELYGFDVLIDSTLKPWLLEVNLSPSLACDAPLDLKIKASMISD 
MFTWGF VCQDPAQRASTRP I YPTFBS S RRNP FQKPQRCRPLSA 
SDABMKNLVGSAREKGPGKLGGSVLGLSMEEIKVLRRVKBENDR 
RGGFIRI FPTS ETWE IYGS YLEHKTSMNYMLATRLFQDRMTADG 
APELKI * SLNSKAKLHAALYBRKLLSLEVRKRRRRSSRliRAMRP 
KYPVITQPABMNVKTETESEEEEEVAIiDNEDEBQEASQEBSAGF 
LRENQAKYTPSLTALVENTPKENSMKVREWNNKGGHCCKLETQE 
LEPKFNLMQILQDNGNLSKMQARIAFSAYLQHVQI\RLMKDSGG 
QTFS AS WAAKEDEQMELWRFLKRASNNLQHSLRMVL PSRRLAL 
LERTRILAHQLGDFI I VYNKETEQMAEKKSKKKVEEEEEDGVNM 
E>JFQBFIRQASEAELEEVLTFYTQKNKSAS VFLGTHS KIS KNNN 
NYSDSGAKGDHPETIMEEVKIKPPKQQQTTEIHSDKLSRFTTSA 
E KEAKLVYS NS S S G PTATLQ K I PNTHLS S VTTS DL S PG P CHHS S 
LSQIPSAI PSMPHQ PTI LLNTVSASAS PCfcHPGAQNI PS PTGLP 
RCRSGSHT IGP FS SFQSAAH I YS QKLS RP S S AKAGS CYLNKHHS 
GIAKTQKEGEDASLYSKRYNQSMVTAELQRLAEKQAARQYSPSS 
HINLLTQQVTNLNLATGI INRSSASAPPTLRPI IS PSGPTWSTQ 
SDPC2APENHSSSPGSRSLQTGGFAWEGEVENNVYSQATGWPQH 
IClfHPTAGS YQLQFALQQLEQQKLQSRQLLDQSRARHQAI FGS Q T 
LPNSNLWTMNNGAGCRISSATASGQKPTTLPQKWPPPSSCASL 
VPKPPPNHEQVlJUlATSQKASKGSSAEGQLNGI^SSliNPAAFVP 
ITSSTDPAHTKIMNHKHTBKQPVHHSWVHD 


5823 


42 


2293 


LLTALSMEGGGGRDEPSACKAGDVNMDDPKKEDILLLADEKFDF 
0LSLSSSSANEDDEVFFGPFGHKERCIAASLBLNNPVPEQPPLP 
TSESPFAWS PLAGEKFVEVYKEAHLLALHIESSSRMQAAQAAKP 
EDPRSQGVERFIQES KF\ KI NLFE KEKEM KKS PTSLKRETYYLS 
DS PLLGP PVGEPRLLAS SPALPS SGAQARLTRAPGPPHSAHALP 
RESCTAHAASQAATQRKPGTKLLLPRAASVRGRGI PGAAEKPKK 
EIPAS PSRTKI PAE KESHRDVLPDKPAPQAVNVPAAGSKLGQG K 
RAIPVP\NKLGUCKTLLKAPGSYSN\liQRKSSSGA\VWSGASSA 
CTPQPVAKAKS SEFAS I PAN* LPGIiCPNI SKS \GRMGPAM LRP A 
L\PAGPVG\ASSWQAKRVDVSELAAEQLTAPP\SASPTQPQTPE 
GGG\QWLNSSCAWSESSQI2WCTRS IRRRDSCLNSKTKVMPTPTN 
QFKI PKFS IGDS\PDSSTPKLSRAORPOS CTSVGRVTVITQTPVP 
RSSGPAPQSLLSAWRVSALPTPASRRCSGIiPPKTPKTMPRAVGS 
PL\CVPARRRSSEPRKNSAMRTEPTRESNRKTDSR\LVDVSPDR 
G S PPSRVPQALNF S PEES OSTFS KSTATBVARE EAKPGGDAAPS 
EALLVDIKLEPLAVTPDAASQPLIDIiPLIDFCDTPEAKVAVGSE 
SRPLIDLMTNTPDMNKNVAKPSPWGQLIDLSSPLIQLSPBADK 
EKVDSPLLKF 


5824 


42 


2293 


LLTALSMEGGGGRDBPSACRAGDVNMDDPKKEDILLLADEKFDF 
DLSLS SSSANEDDEVFFG PFGHKERC I AASLELNNP VPEQP PLP 
TS ES P PAWS PLAGE^FVEVYKEAHLLALH I ESSSRNQAAQAAKP 

BDP RSQG VERFIQESKF \KI NLF BKBKBM KKS PTSLKRETY YLS 
DSPLLGPPVGEPRLLASSPALPSSGAQARLTRAPGPPHSAHALP 
RESCTAHAASQAATQRKPGTKLtiLPRAAS VRGRGI PGAAEKPKK 
EIPASPSnTKIPAEKESHRDVLPDKPAPGAVNVPAAGSHLGQGK 
RAIPVP\NIOGIiKiCTl4LKAPGSYSN\LQRKSSSGA\VWSGASSA 
fU^VAKAKbSEFAS IPAN*LPGLCPNISKS\GRMGPAMI*RPA 
L \ PAGP VG \ ASS WQA KRVD VS E LAAEQLTAP P\SAS P TQ PQTP E 
GGG\QWLNSSC^WSESSQUJKTRSIRRRDSCI^SKTKVMPTPTN 
QFKIPKFSIGDS\PDSSTPKLSRAQRPQSCTSVGRVTVHSTPVR 
RSSGPAPQSLLSAWRVSALPTPASRRCSGLPPMTPKTMPRAVGS 
PL\CVPARRRSSEPRKNSAMRTEPTRESNRKTDSR\LVDVSPDR 
GSPPSRVPQALNFSPEESDSTFSKSTATEVAREEAKPGGDAAPS 
BALLVDIKLEPtiAVTPDAASQPLl DLPL IDFCDTPBAHVAVGSE 

SRPLIDLMTNTPDMNKNVAKPSPWGQLIDLSSPLIQLSPEADK 
BNVDSPLLKF 


5825 


2 


42lti 


FI^IESASPAPFSSGFZJ^PHSPGGSIATKGRSRLSAPGMIJ^ 

SAAPPAPPPEVTATARPCLCSVGRRGDGGXMAAAGALERSFVKI* 

SGAERER PRH FREFTVCS IGTANAVAGAVKYS ESAGGF Y YVE S G 

KLFSVTRNRFlHWKTSGDTLELMEESLDINIiLNNAIRLKFQNCS 

VLPGGVXVS ETQNRVI Z LML1NQTVHRLLLPH PSRMYRS ELWD 

SQMQSIFTDIGKVDFTDPCNYQLIPAVPGISP1TSTASTAWLSSD 

GEALFALPCASGGIFVLKLPPYDIPGMVSWELKQSSVMQRLLT 

GWMPTAIRGDQ S P5DRPLS LAVHCVEHDAFI FALCQDH KLRM WS 

YKEQMCIWADMLEYVPVKlODLRLTACnCTKLRIAYSPTMGLYIi 

GIF\MHAPKRGQFCIFQLVSTESNRYSLDHISSLFTSQETLIDF 

ALTSTDIWALWHDAENQTVVXYINFEHNVAC5QWNPVFMQPLPEE 

EIVIRDDQDPREMYLQSLFTPGOFTNEAiCKAIiQIFCRGTERNL 

DLSVJSELKKEVTIjAVEKELCjGSVTEYEFSQEEFRNLQQEFWCKF 

YACCI^YQEALSHPLALHLNPHTtmVCXiLKKGYLSFLIPSSLVD 

HL YLLP YEKLliTEDETT I SDDVDI ARD V ICLI KCLRL X EES VTV 

DMS VI MEMS CYNIjQS PRKAAEQ ILEDMTTI DVENVMEDI CS KLQ 

BIRNPIHAIGLLIREMDYETEVEMEKGFNPAQPLNIRMNLTQLY 

gsntagyt vcrgvhki astrfl icrdllilqqllmrlgdavi wg 

rGQLFQAQQDLLHRTAPLLLSYYLIKWGSECLATDVPLDTLESN 
LQHLS VLELT0SGALMANRFV3SPQT IVELFFQEVARKH 1 1 SHL 
FSQPKAPLSQTGLNWPEMITAITSYXiLQtiLMPSNPGCLFLECUfl 
GNCQYVQLQD YIQIiLHPWCQVNVGS CRFMLGRCYLVTGEGQKAL 
ECFCQAASEVGKEEFIiDRLIRSEDGEIVSTPRLQYYDKVLRLIjD 
VIGLPELVIQLATSAITEASDDW\KSOATLNRTCIFKHHI,\DLG 

\HNSQAYGSL* pqi pdssrqldclrqlvwlcersqlqdlvefs 

YVNLHNE WG I IES RARAVDLMTHNYYEIjIjYAFH I YRHN YRKAG 
TVMFE YGMRLGRE VRTLRGLBKQGNC YWVAI1NCLRLI RP EYAWI 
VQPVSGAVYDRPGA5PKRNHDGECTAAPTNRQIEILELEDLEKE 
CSLAR I RLTIAQHD ?S AVAVAGS SS AEEMVTLLVQAGLFjDTA 1 S 
ij^xrrojriJi ^vrt^LiAr AuIKIjQFGGEAAQA^ 1 
5VITTKESSATDEAWRLLSTYLERYKVQNNLYHHCVINKLLSHG 
VPIiPNWLINSYKICVDAAELLRLYIjWYDI»Z»DLTPYQVIRICGC 1 


5826 


3 


871 


ksqllrdhsapppkpctsvgamgc* prq/ bpkeqqrqlkkqknr I 

AMQRSRQKHTDKADAIiHQQHESLEKDNLAIjRKEIQSLQAELAW 
WSRTLHVHERLCPMDCASCSAPGIiLGCWDQAEGIjLGPGPCXSQHG 
CREQLEL FQTPGS CY PAQPLS PGPQPHDS PSLLQCPLPS LSLGP 
AWAEPPVQLSPSPLLFASHTGSSLQGSSSKLSALQPSLTAQTA 
PPQPLBLEHPTRGKLGSSPDNPSSALGLARLQSREHKPALSAAT 
WQGLWDPS PHPLLAFPLLSSAQVHF 1 


5827 


194 


2287 


UMGSENSAiKSYTLREPPFTLPSGLAVYPAVLQDGKFASVFVYK J 
REKEDKVNKAAKVP * * HUCTLRHPCLLRFLSCTVEADGIHLVTE 
RVQPLE^^ETIjSSAEVCAGIYDILLALIFLHDRGHLTHNNVCL 
SSVFVSEDGHWKLGGMETVCKVSQATPEFLRSIQS IRDPAS IPP 
BEMSPEFTTLPECHGHARDAFSFGTLVESLLTILNEQVSADVLS 
3FQQTLHSTLLNPIPKWRPALCTLLSHDFFRNDFLEVVNFLKSL 
rLKS EEE KTE FFKFLL DR VS CLS EEL IAS RLVPLLLNQ LVFAE P 

VAV\KSFI*PYLLGPKKDHAOGRTPrTiT.<? DAfTkncDW- -q\)t' Vt\t o 
EVHBEHVRMVLLSHIKAYVGALSLREQLKKV\IL\PQVLU3\LR 
D\TSDS I YAI TLHSLAVLVS LLGPEVWGGERTK I FKRTAP \SF 
TK\NTDLS LEGDPFSQP I KFP INGLSDVKNTSEDS ENFPSSS KK 
SEEWPDWSGPB\EPENQTVK1\QIWP\REP\CDDVKSQCTTLDV 
BESSWDDCEPSSLDTKVNPGGGITATKPVTSGEQKPIPALIiSLT 
BE SM P WKS S LPQKI S IjVQRGDDADQI B PPKVSSQERPLKVPS BI* 
GIiGBEFTIQVKKKPVKDPEMDWFADMlPEIKPSAAFLILPELRT 

EMVPICK nnVQ PUMrtTT Q Q VPfc A IVE* TTTTf T?HrV*ltfEiOT5/"»»T mtnn\nf 

ai'ivriuwuvarvnyp o a urivut ax i Msluifiu WhLtiJk5vjEL»Nw EDH2I 
W 


5828 


2 




AR EGGSLGAVAACG ELS YS CD FCPARPHTS WLTRF VKME FQAW 
MAVGGGSR^3TDLTSS I PKPLLPVGNKPLI WV PLNLLERVGFEBV 
IWTTRDVQKALCAEFKMKMKPDIVCIPDDADMGTADSLRYIYP 
KLKTDVLVLS CDLI TD VALHEWDL FRAYDAS LAMLMR KGQDS I 
B P VPGQKG KKKAVEQRD PIGVDS TGKRLLFMANEADLDEELVIK 
GSILQKHPRIRFHTGLVDAHLYCLKKYIVDFLMENG\SITS1RS 
au \irjuv / KuAUeaoAoowUvfi «-K-ttJ^^C»oKGKRGLlCSFRISY 
S FY* KEAN YTGTG AP Y \ D\ ACWI 


5829 


260 


1259 


PDGRLI VSCSEDKT IKI WDrrTNKQCVNNFSDSVGFANFVDFNPS 
GTCIASAGSDQTVKVWDVRVNKLLQHYQVHSGGVNCISFHPSGN 
YL1TASSDGTLKILDLLKGRLIYTLQGHTGPVFTVSFSKGGELF 
ASGGADTQVLLWRTNFDEZiHCKGLTKRNLKRLHFDSPPHLLDI Y 
PRTPHPHEBKVETVBDFFLHLLRIiIQSLR* SI CRSLLPLLWISF 
L LI L PQQQKP WGLCQTRVKRPVD IS *TLP * CHQNVCQQPRKRK 
QICT*VTSPVKVK/VSIPIAVTDAI^HIMEQLNVLTQTVS ILEQR 
LTLTEDFaKDCLENQQKLFSAVQQKS 


5830 


4436 


3139 


GGKMAAPEBRDLTQEQTEKLLQFQDLTGIESMDQCRHTIiEQHNW 
w i£AA v y ljkjuw cy caj v e t> v r np p RPIjQVWTADHRIYSYVVSR 
PQPRGLLGWGYYLIMLPFRFTYYTILDIFRFALRFIRPDPRSRV 
TDPVGDrVS FMHS FEEKYGRAHPVFYQGTYSQALNDAKRELRFL 
LVYLHGDDHQDS DEFCRNT LCAPE V I SLINTRMLFWACS TNKPE 
GYRVSQALRBNT YPFLAMI MLKDRRE * P V\ VGRLEGLI \QPDDL 
rNQLTFINtQANQTYLVSERLEREERNQTQVLRQQQDFAYLASLR- 
ADQEKERKKREERERKRRKKEEVQQQKLAEERRRQNLQEEKERK 
LECLPPEPSPDDPESVKI I FKLPNDSRVERRFHFSQSLTVIHDF 
LFSLKESP\EKFQIEA\NFPRR\VLPCIPSBE\WPNPPTLQE\A 
GLS HTEVLFVQDLTDE 


5831 


71 


2897 


FCSKDKCCLYL P DS INRSKS CrAKPGAHSQDRHAVMDS SRqVkD 
TDDIESPKRSIRDSGYIDCWDSERSDSLSPPRHGRDDSFDSLDS 
FGS RS RQTPS P D WLRG S S DGRG S DS ES DL PHRKL PDVK KDDMS 
ARRTSHGEPKSAVPFNQYLPNKSNQTAYVPAPLRKKKABREEYR 
KS WS TATS PAG LGKKALQD YGPRT \ P VS \DDAESTSMFDMRC3E 
EAAVQPHSRARQEQLQLINNQLREEDDKWQDDLARWKSRKRSVS 
QDLIKKBEERKKMEKLLAGEDGTSERRKSIKTYREIVQEKERRE 
RELHEAYKNARSQEEAEGILQQYIERFTISEAVLERLEMPKILE 
RSHSTEPNLSS FLNDPKPMKYLRQQSLPPPKFTATVETT1ARAS 
VLDTSMS AGSGS PS KTVTPKAVPMLTPKPYSQPKNSQDVLKTFK 
VDGKVS VNGETVHREEE KERECPTVAPAHSLTKSQMFEGVARVH 
GSPLELKQI^GSIEINIKKPNSVPQBLAATTBKTEPNSQEDKND 
GGKSRKGNTEIiASSEPOHPTTTVTPP<?OTVaT7VT»T70QCBrtT mm 

VSEBKDQKKPENEMSGKVELVLSQKVVKPKSPBPEATLTFPFLD 
KMPEANQLHLPNLNSQVDSPSSEKSPVTTPFKFWAWDPEEERRR 
QEKWQQEQBRLLQBRYQ\KEQDK\LKEE\WEKAQICEVEEEBRRY 
YEEEP+II\EDPWPFTVSSSSADQLSTSSS^3TEGSGTMNKIDL 
GNCQDEKQDRRWKKSFwGDDSDLLLKTRESDRLEEKGSLTEGAL 
AHSGNP VS KG VKE DHQLDTEAG APH CG TNPQLAQD P S QNQQTSN 
PTHSS EDVKPKTLPLDKS INHQIES PSERRKS ISGKXLCSSCGL 
PLGKGAAMI IETLNLYFHIQCFRCX3\ I CKGQLGDAVSGTDVRIR 
NGLLNCNDCYMRSRSAGQPTTL 


5832 


2454 


829 


PGRRFRHGSCAFQKQCIMLHICQYFLQGECKFGTSCKRSHDFSN 
SENl^KI^KLGMSSDLVSRLPTIYRNAHDIKNKSSAPSRVPPLF 

VPQGTSERKDS 5GS VS PNTLSQB EGDQ I CLYH I RKS CS FQ DKC H 
RVHFHLPYKWQFLDRGKWEDLDNMEL1EEAYCNPKIERILCSES 
ASTPHSHCLNFNAMTYGATQARRLS TASS VTXPPHFI LTTDWI W 
YWS DE FGS WQE YGRQGTVHPVTTVS S SDVE KAYLAY/ W YTG V* R 
PGSHLEVPGRKAQLRVRFOSLRSEKPGLWHN*KGLPQTQIR\AP 
QDVTT^yrCNTXFPGPKS IPDYWDSSALPDPGFQKITLSSSSEE 
YQ KVWNLFNRT L P FY F VQKI ER VQNLALWE VYQWQ KGQMQ KQNG 
G KAVDERQLFHGTSAI FVDAI OQQNFDWRVCGVHGTS YG KGS YF 
ARDAAYSHHYSXSDTQTHTMFLARVLVGEFVRGNASFVRPPAKE 
GWSNAFYDSCWSVSDPSIFVIFEKHQVYPEYVIQYTTSSKPSV 
TPS ILLALGSLFSSRQ 


5833 


170 


3289 



S I LCLLS P CWQ FGKP WS I t&SRSRHS PCTKKG WEGMRKHLiHT 
RQGHK* VHVEIS KALWVYRDDYFI RHS ISVSAVIVRAWI THKYR 
GRDWNVKWBENLLHAVAKNYTLLQTI PPFERPFKDHQVCLEWNM 
GYIWNLRANRIPQCPLENDWALLGFPYASSGKNTGIVKKFPRF 
RNRELEATRRQRMDYPVFTVSLWIjYLIiHYCICANLCGILYFVDSN 
EMYGTPS VFLTEEG YLH I QMHLVKGEDIiAVKTKFI I PLKEWFRL 
DISFNGGQIWTTSIGQDLKSYHNQTISFREDFHYNDTAGYFII 
GGS RYVAG I EG FFGPLK YYRLRSLHPAQIFNPLLEKQLAEQ I KL 
YYE RCAEVQE I VS VYAS AAKHGG ERQEACHLHNS YLDLQ RR YGR 
PSMCRAFPWEKELKDKHPS LFQ ALL EMDLLTVP RNQNB S VS E IG 
GKIFEKAVKRLSSIDGLHQISSIVPFLTDSSCCGYHKASYYLAV 
FYETGLWPRDQLQGMLYSLVGG<K5SERLSShINLGYKHYQGIDN 
YPLDWELSYAYYSNIATKTPLDQHTLQQDQAYVETIRT.KDDE II* 
KVO/TKEDGDVFIWLKHEATRGNAAAC^RIJIQMLFWGGXJGVAKNP 

KXAAS KGLKOAVNGIiGWYYHKFKKNYA\KAAKYWLKA\BE\MGN 
PDASYNLGVLHlJ)GrFPGVPGRNQTIJ\GEYFHKAAQQGRNIEGTL 
WCSLYYITGNLETFraDPEKAVVV^KHVAEKNGYLGHVIRKGIiN 
AYLEGSWHEALLYYVLAAETGIEVSQTNLAHICEERPDLARRYX 
GVNCVWRYYNFSVFQIDAPSFAYLFCMGDLYYYGHQNQSQDLEI*S 
VQM YAQAALDGDS QG FFNLALLI EEGTI I PHHILD FLE I DSTIjH 
SNNISILQELYERCWSHSNEESFSPCSLAWLYLHLRLLWGAILH - 
SALI YFLGT FLLS I LIAWT VQYFQS VS ASDPP PRPSOASPDTAT 
STASPAVTPAADA5DQDQPTVTNNPEPRG 


""5834 


17 


4020 


rfrrgggrvfpgafpaspsdslgqgnsqgpprtpkpprt/qecg 

SAAPGPIPGQSSS*VPLRLEQIQQKADCPLSLBLALKPRMAAQV 
TLBDALSNVDLLEELPLPDQQPCIEPPPSSLLYQPNFNTNFEDR 
NAFVTG IARY I EQATVHS SMNEMLEEGQEYAVML YTWRSCSRA I 
PQVKCNEQP1TOVEIYEKTVEVLEPEVTKIMNFWYFQRNAIERFC 
GE VRRLCHAERRKDFVS EAYL ITLGKF INMFAVLDELKNMKCS V 
KNDHSAYKRAAQFI^KMADPQSIQESQNLSMFIiANHNKITQSLQ 
QQLBVISGYEBIJ*ADrVNLCVDyYENRMYLTFSEKHMLLKVMGF 
GLYLMDG5VSNIY1CLDAXKRINIjSKIDKYFKQIiQVVPLFGDMQ I 
ELARYIKTSAHYEENKSRWTCTSSGSSPQYNICEQMIQIREDHM 
RFISELARYSNSB\nnX5SGRGEAOKTDAEYRKLFDIiAIOGLQLL 
SQWSAHVMEVYSWKLVHPTDKYSNKDCPDSAEEYERATRYNYTS 
EEKFALVEVIAMI KGLQVLMGRMES VFNHAIRHTVYAALQD FS Q 
VTLWEPLRQAIKKKKNVIQSVIjQAIRKWCDWETGHEPFNDPAL 
RGBKDP KSG* D I KVPRRAVGPS STQL YMVRTMLBS LIADKSGS K 
KTLRSSLEGPT ILD I EKFHRES FFYTIILINFSETLQQCCDLSQL 
WFREFFLELTMSRRIQFPIEMSMPWILTDHILETKEAStWEYVIt 

ysldlyndsahyaltrfnkqflydeieaevnu:fdqfvykladq 
ifayykvroagsllu3krlrsecknogatihlppsnryetllkqr 
hvqllgrs idlnrlitqrvsaamykslblaigrfesedlts ive 

LDGLLEINROTHKLLSRYLTLDGFDAMFREANHNVSAPYGRITI* 
HVFWEIiNYDFLPNYCTOGSTNRFVRTVLPFSQEFQRBKQPNAQP 
QYLHGSKALNLAYSS I YGSYRNFVGPPHFQVICRLLG YQGIAVV 
MEEIjLKVVKSLIjOGTIIjOYVKTLMEVMPKICRLPRHEYGSPGIL 
EFFHHQLKDIVB YAE LKTVCFONLREVGNAI LFCLL IEQS LSLE 
EVCDIiLBAAPFQNILPRVHVKEGBRLDAKMKRIiESKYAPIiHLVP 

LIERLGTPQQIAIAREGDLLTKERLCCGL>SMFEVIL'rRIRSFLD~ 
DP I WRGPLPSNGVMHVDECVS FHRLWSAMQ FVYC I P VGTHE FTV 
EQCFGDGLHWAGCMI IVLLGQQRRPAVLDPCYHLLKVQKKDGKD 
EI I KNVPLKKMVERIRKFQI LNDE 1 1 T I LDKY LKSGDG EGT PVE 
HVRCFQPPIHQSLASS 


5835 


4209 


1904 


SGNIRmQGSHQIDFQVLHDIaRQKFPEVPEVWSRC^^TJNKti 
DACCAVLSQESTRYLYGEGDLNFSDDSGISGLRNHMTSLKLDLQ 
SQNIYHHGREGSRMNGSRTLTHSlSDGQIiQGGQSNSELFQQEPQ 
TAPAQVPQGFNVFGMSSSSGASNSAPHLGFHLGSKGTSSLSQQT 
PRFNP I MVTIAPNI QTGRNTPTSLKI HGVP PPVLNS PQGNS I YI 
RP YITTPGGTTRQTQQHSGW VSQFNPMN PQQVYQ PS Q PGPWTTC 
PASKPLSHTS5QQPNQQGHQTSHVYMPISSPTTSQPPTIHSSGS 
SQSSAHSQYNIQNISTGPRKNQIEIKLEPPQRNNSSKIiRSSGPR 
TSSTSSSVNSQTLNRNQPTVYIAASP PNTDBLMSRSQPKVYI SA 
NAATGDEQ VMRNQPTLFISTNS GASAASRNMSG Q VS MGPAF IHH 
H P P KS RAI GNN S ATS PR VWTQPNT \ EYTFK I TVS PNKP P A VS P 
GWS PTFE LTNLLNHPDHYVETEN IHHLTDPTLAHVDRISE TRK 
LSMGSDDAAYTQDI *RISNS WLGMVAHACNSSALGGQDGRI I +A 
QBFETSWGNI WRLRLYRRF*NYAGMVAHTCSPS YSVD * ALLVHQ 
KARMERLQRELB IQKKKLD KLKS EVNEMENNLTRRRLKRSNS I S 
Q I PS LEEMQQLRS CNRQLQ I D IDCLTKE IDLFQARG PHFNPS AI 
HNFYDNIGFVGPVPPKPKDQRSIIKTPKTQDTEDDEGAQWNCTA 
CTFIiNHPALIRCBQCEMPRHF 


5836 


361 


2303 


FHITMCGlCCSVNFSAfiriF'sODLkEDLLYNLKQRGPNSSKQLLK 
S D VNYQ Ch FSAHVLHJiRGVLTTQPVE DERGNVFLWNG El FSGIK 
VSAEENDTQIIjFNYLSS CKNES EILSLFS EVQGP WS F IY YQAS S 
HYLWFGRDFPGRRSIiLWHFSNLGKSFCLSSVGTQTSGLANQWQE 
VP AS \ D FS E LILS LLS FPDALF YNC I LGNI FLGRILLKKMLIA* 
VXFQQTYQHLYQR* QMKPNC I LKNLLFL * I * CCHKLHWRLI AV I 
FPMCHLQERYFKSFLLMYT*XEVIOQFIDVIiSVAVKKRVLCLPR 
DENLTANEVUCTCDRKANVAI LFSGGI DSMVIATLADRHI PLDE 
P IDLLNVAFIAEEKTMPTTFNREGNKQ KNKCEI PS EEFS KD VAA 
AAADSPNKHVSVPDRITGRAGLKBLQAVSPSRIWNFVEINVSME 
ELQKIiRRTRICHLIRPLDTVLDDSIGCAVWFASRGIGWLVAQEG 
VKSYQSNAKWLTGIGADEQIAGYSRHRVRFQSHGLEGIjNKEIM 
MELGR 1 SS RNLGRDDRVTGDHGKEARFP FLDENVVSFLKSL P 1 W 
EKANLTLPRGIGEKLLIiRIiAAVSLGLTASALLPKRAMQFGSRIA 
KMEKINEKASDKCGRLQIMSLENLSIBKETKL 


5837 


4792 


903 


KGNAVAQAPVTNCCYIATGSKDQ^IRIWSCSRGRGVMiLKLPFJL 

KRRGGG IDPTVKERLWLTLHWPSNQPTQLVSSCFGGELLQWDLT 

QSWRRKYTLFSASSEGQNHSRIVFNLCPIjQTEDDKQLLLSTSMD 

RDVKCWDIATLECSWTLPSLGGFAYSLAFSSVDIGSLAIGVGDG 

MIRVWNTLS I KNNYDVKNFWQGVKS KVTALCWHPT KEG CLAFG T 

DDGKVGIiYDTYSNKP PQISS TYHKKTVYTLAWGPPVP PMS LGGE 

GDRPSLALYSCGGEGIVLQHNPWKLSGEAFDIMKLIRDTNSIKY 

KL PVHTEI S WKADGKIMALGNEDGSI E I FQ\ IPNLKL I CTIQQH 

HKLVNTISWHHE\HGSPAQKLSYL\MPSGSQQCSPFTCHNLKNC 

P*KAAPBSPSDPLQSPYRTPPQGHTAQDYPVWAWEPHIH*WEGL 

VFCFPIDGYSPGCWD\AFPGKEAPVAIFRG\HQGRLLCVAWSPIi 

dpdo:ysg\addfcvhkwltsmqdhsrppc^kksielekk3?lsq 

PKAKPKKKKKPTLRTPVEQCjES IDGNEEBSMKENSGPVENGVSDQ 
EGEEQAREPELPCGLAPAVSREPVICTPVSSGFEKSKVTINNKV 
ILLKKEPPKEKPETLIKKRKARSLLPLSTSLDHRSKEELHQDCL 
VLATAKHSREU^DVSADVEERFmGLFTDRATLYRMIDIEGKG 
HLENGHPELFHQLMLWKGDLKGVLQTAAERGELTDNLVAMAPAA 
GYHVWIjWAVEAFAKQLCFQDQYVKAASHLl^IHKWEAVEIiLKS 
NHFYREAIAIAKARLRP3DPVLKDLYLSWGTVLERDGHYAVAAK 
CYLGATCAYDAAKVLAKKGDAAS LRTAAELAAI VGEDELS ASLA 
LRCAQELLLANNWVGAQ5^QLHESLQGQRLWCIiLELLSRHLE 
EKQLSEGKSSSS YHTWNTGTEGP FVBRVTAVlflCS IFSLDTPEQ Y 
QEAFQKLQN I KYPSATNNTP AKQLLLHI CHDLTLAVLSQQMASW 






WO 01/53312 



PCT/US00/34263 




□EAVQAUJRAVVRSYDSGSPTIMQEVYSAFLPDGCDHLRDiCLGD 
HQS PAT PAFKS LEAFFLYGRLYEFWWSLSR PCPNSSVWVRAGHR 
TLS VE PSQQLDTASTEETDPETSQPEPNRPSELDLRLTE EGERM 
LST FKELFSE KHASLQNSQRTYABVQETLAEM I RQHQKSQLCKS 
TANGPDKNBPEVEAEQPLCSSQSQCKEEKNEPLSLPELTKRLTE 
ANQRMAKFPESIKAWPFPDVLBCCLVLLLIRSHFPGCLAQEMQQ 
QAQELLQKYGNTKTYRRHCQTFCM 


5838 


110 


98 


KTM PHLLVT FRDVAIDFS QEE W ECLD P AQRDli YRDVMLENY5NL 
IS LDLESSCVTKKLSPEKEI YEMES\ PSGRI WGNVSTITFQYNG 
LGDNMECKGNLEGQVS KSEGLYMCVKITCB EKATESHS TS S TFH 

ri i /hyqgki vkckecrqgfs yls cliqheenhni * kcs evnkh 
rntfskkpsyi*hq\kfrlgekpyt;cmecgkafgrtsdliqhqk 
ihtnekpyqcnacgkafirgsqltehqrvhtgekpydckkcgka 
fsycsqytlhqrihsgekpyeckdcgkafilgsqltyhqrihsg 
ekpyeckecgkafilgshltyhqrvhtgekpyickecgkaflca 
sqlnehqrihtgekpyeckecgktffrgsqltyhlrvhsgerpy 
kckecgkafisnsnliqhqrihtgekpykckecgkaficgkqls 
ehqrihtgekpfeckecgkafirvayltqhekihgbkhyeckec 
gktfvratqltyhqr i htgekp ykc kbcdxaf/ hlwlt i ls ehq 
rihrgbkpyeckqcgr/lfirgshl/nehlrthtgekpyeckec 
grafs rgs ehtlhqr i htgekp ytcvqcx3kdfr cpsqltqhtrl 
hn* eysshki cwhs ialasldfahlqeknpen 


5839 


1 


2425 



GRPFPRPPRALPRLPLRGRRQDGRWTVDFEECLKD\SPRFRAAL~ 
EEVEGD VAELEL KL\ DKLVKLC I A \ M I DTG KAFCVANKQFMNG I 
RD\LAQNS \NNDA\WETKFAPS FLDSLQEMINFHTIIi/L* PNS 
EIN* GHSFQNFVKEDIiRKFKDAKKQFE^SQ* KRKKIALVKNAPV 
PSRPAS LEL* KP PNILTATRKCFRH I ALDYVLQ I NVLQS KRRSE 
I LKS MLSFMYAHLAFFHQG YDLFS ELG P YMKDLGAQLDRLVGDA 
AKEKREMEQKHST I QQKD FSRDDS KLKYNVDAANG I VMEGYLFK 
RASNAFKTWNRRWFS IQNNQWYQ KKF KDN P TVWE DLRLCTVK 
HCED I ERRFCFE WS PTKS CMLQADS EKLRQAW I KAVQTS I \AT 
AYREKDDESEKLDKKSS PSTGSLDSGNESKEKLLKGESALQRVQ 
CIPGNASCCDCGLADPRWAS INLGITLCIBCSG IHRSLGVHFSK 
VRSLTLDT WE P E LLKLM CELGNDVINR VYEANVE KMG I K KP QPG 
QRQE KEAYIRAKYVERKFVDKI FL * SLS PP \BQQKK\ FVS KSS E 
EKRLS ISKFGP\GDQVRASAQSSVRSWDSGIQQSSDDGRBSLPS 
TVSANS LYE P BGERQDS SMFLDS KHLNPGLQLYRAS YEKNLPKM 
AEALAHG ADVNWANS EENKAT PL I Q AVLG GS L VT CE FLLQNG AN 
VNQRD VQGRG P LHHATVLGHTGQVCLFLKRGANQHATDEEGKDP 
LSIAVEAANADI VTLLRLARMNEEMRESEGLYGQPGDETYQDI F 
RDFSQMASNNPEKLNRFQQDSQKF 


5840 


698 


3610 


KHLHLPRQHLTTLWQISS PRWRSP. QRAFMSALS KTQTQSAPALQ 
GLSS LLOS VTGNPVPASEAASQSTSASPANTTVYTI KGRNL PS S 
AQPFI PKSFNYS PNSSTSEVSSTSASKAS IGQSPGLPSTAFKLP 
SNTKG FTATHNTS PAAPPTEVTICQSSEVSKPKL\ESESTS PS L 
\2MKIHNFLKGNPGF5VA*NLKHPNPAGSLGSSAPSESHPSDFQ 
RGPTSTS IDNIDGTPVRDERSGTPTQDEMMDKPTSSSVDTMSLL 
S KI I S PGSSTPS S TRS PPPGRDES YPRELSNSVSTYRPFGLGSE 
S PYKQ P S DGMER P S S LMDS S QE KFYPDTS FQEDEDYRDFE YSGP 
P PS AMMNLQKKPAKS 1LKS S KLSDTTE YQ PILS SYSHRAQEFG V 
KSAFPPS VRALLDSSENCDRLSSSPGLFGAFSVRGNBPGSDRS P 
S PSKNDS F bTPDSNHNSLSQSTTGHLSLPQKQYPDS PHPVPHRS 
LFSPQNTLAAPTGH ? PTSG VE KVLAST IS TTST IEF KNMLKNAS 
RKPSDDKHFGQAPSKGTPSDGVSLSNLTQPSLTATDQQQQBEHY 
RI ETRVS S S CLDL PDS TEE KGAP IETLG YHSASNRRMSGEP I QT 
VESIRVPGKGNRGHGREASRVGWFDLSTSGSSFDNGPSSASELA 
SLGGGGSGGLTGFKTAPYKERAPQFQESVGSFRSNSFNSTFEHH 
LPPSPLEHGTPFQREPVGPSSAPPVPPKDHGGIFSRDAPTHLPS 
VDLSNP FTKEAALAHAAPP PP PGEHSG I PFPTPP PP P ? PGEHSS 
SGGSGVPFSTPPPPPPPVDHSGWPPPAPPLAEHGVAGAVAVFP 
KDHS S L LQGTLAEH PGVLPG PRDRGG PTQRDLNG PGLS R VRES L 

RHKVISGrri^YI^PKDmQPSSSFFSISPTSNSSATlARElSOr" 
NGTSSTAEAIGLKGSSPTPPCSPVQPSKQLEYIaARIQGFQVHYC 
DRQSGKECVTCLTLAPVQMTFHArGSS IEASHDQV* YATAILLC 
YG P ARKWKAI KMBAMCAHAALLS L IHYLLAP S ARLBKS KLFALG 

N- 


5846 


1126 




FSKLIKKTFIIGISGVTNSGKTTLAKNLQKHLPNCSVISQDDFF 
KPESEIETDKMGFLQYDVLBALNMEKWMSAISCWMESARHSVVS 
TDQ2S ABE I P I LI I EGFLLFNYKPLDTI WNRS YFLT IPY3ECKR 
RRSTRVYQPPDSPGYFDGHVWPMYLKYRQEMQDITWEWYLDGT 
KSEEDLFLQV YEDLIQELAKQKCLQVTA* RRNTTNPS /CK+ IRK 
LQGVI 


5847 


2769 


sofe 1 


APEMEDI^SPDSTLIWGHNLLSSASFQESVTFKDVIVDFTQlg" 
WKQLDPGQRDLFRDVTLENYTHLVS I GLQ VS KPDVISQLEQGTE 
PWIMEPSIPVGTCADWETRLENSVSAPEPDISEEELSPEVIVEK 
EKRDDSWSSNLLESWEYEGSLERQQANQQTIiPKEIKVTEKTrPS 
WEKGPVNNEFGECSVNVSSNLVTQEPSPEETSTKRSIKQNSNPVK 
KEKSCKCNECGKAFSYCSALIRHQRTHTCEKPYKCN+ /CVEKAF 
SRSENLINHQRIHTGDXPYKCDOCGKGFIEGPSLTOHQRIHTGE 
KP YKCDECGKAFS QRTHLVQHQRI HTGEKPYTCNECGKAFS QRG 
HFMEHQKIHTGEKPFKOECDKTFTRSTHIiTOHCifrrPTopirrw 
CNECGKAFNGPSTFIRHHMIHTGBKPYECNECGKAFSQHSNLTQ 
HQKTHTGE KP YDCAE CGKS FS YWS S LAQHL K IHTGEKPYKCNEC 
GKAFSYCSSLTQHRRIHTREKPFECSECGKAFSYLSNLNQHOFCT 
HTQEKAYECKECGKAFIRSSSLAKHERIHTGEKPYQCHECGKTF 
* STGSSLIQHRKIHTGERPYKCNECGRAFNQNIHLTQHKRIHTGA 
KPYECA3CGKAFRHCSSLAQHQKTHTEEKPYQCNXCEKTFSQSS 
HLTQHQRIHTGEKPYKCNECDKAFSRSTHLTQHQRIHTGEKPYK 

QJECGK\TFSQSTYLIQHQRIHSGEKPFGCNDCGKSFRYRSAZ*N 
KHQRLHPGI 


5848 
5849 


22 


2961 



AAPRRIiLRGGDGDRTPR FPLPALLR PGP PAEAAPERR KM PA VS K 
GDGMRGIiAVFI SDI RNCKSKEAEIKRINKELANI RSKFKGDKAL 
IX^SJOGCYVCKLLFIFIiI^flDI DroHMEAVNLLSSlIRYTEKQIG 
YLF X S VLVNSNS EI» I RL INMAI KNDLASRNPTFMGLALHC IAS V 

GSREMAEAFAGEIPKVLVAGDTMDSVKQSAALCLIiRLYRTSPDL 

VP>K3DWT3RVVHLLNDQHLGWTAATSLITTLAQKNPEEFKTSV 

SLAVSRLS\RIVTSASTDLQDYTY*FCPGFIjGLSVKLIiRLLQCY 

PPPDPAVRGRLTEC^ETILNKAQBPPKSKKVQHSNAKNAVLFEA 

I 3LI I HHDS E PNLLVRACNQLGQ FLQHRETNLR YLALES MCTLA 

SSEFSHEAVKTHIETVTNALKTERDVSVRQRAVDLLYAMCDRSN 

APQrVAEMLSYLETADYSIREEIVLKVAILAEKYAVDYTW\YVD 

TILNLIRIAGDYVSEEVWYRVIQIVINRDDVQGYAAKTVFEALQ 

APACHENLVKVGGYILGEFGNLIAGDPRSSPLIQFHLLH3KFHL 

CSVPTRAliLLSTYIKFVNLFPEVKPTIQDVLRSDSQLRNADVBL 

QQRAVEYLRLSTVASTDILATVLEEMPPFPERESSILAKLKKKK 

G P STVTDL EDTKRDRS VD VNG G P E PAPAS T S AVS TPS PSADLLG 

LGAAPPAPAGPPPSSGGSGLLVDVFSDSASWAPLAPGSEDNFA 

RF VCKNNGVLFENQLLQ IGLKS E FRQNLGRMFI FYGNKTS TQFL 

NFTPTLI CSDDLQPNLFJT QTKPVD PTVEGGAQVQQWNI ECVSD 

FTEAPVLNI QFRYGGTFQNVS VQLP ITLNKF FQ PTEMAS QDFFQ 

RMKQLSNPQQEVQNI FKAKHPMDTEVTKAKI IG FGSALLE3VDP 

NPANFVGAG 1 1 HTKTTQIG CLLRLE PNLQAQMYRLTLRTS KEAV 

SQRLCELLSAQF 




3545 


1895 



KRREIKBTVFHHVAQAGLELLSSSNPPSSASRSAGITGMRHQVQ 
P*DPOISLSPPCFTEEDRFSLEALQTIHKQMDDDKDGGlEVEES 
DEFrREDMKTKI^TNKHSHmREDKHITIEDLWKRWKTSEVHNW 
T LE DTLQWL I E F VEL P Q YEKNFR DNNVKGTTL P R I AVHE PS FMI 
SQLKISDRSHRQKLQLKALDWLFGPLTRPPHNWMKDFILTVSI 
^GVGG CWFAYTQNKTSKEHVAKfWKDLESLQTAEQSIWDLQER 
LBKAQEEKRNVAVEKQNL*RKMMDEINYAKEEACRLRELREGAE 
2ELSRRQYAEQELEQVRMALKKABKEFELRSSWSVPDALQKWLQ 
LiTHEVEWYYNIKRQNAEMQLAIAJTOEMKIKKKRSTVFGTI^ 

AHSSSLDEVDHKILEAKKALSELTTCLRERL7RWQQj[^liicX5FQ" 
1AHNSGLPS LTSSLYSDHSWV VMPRVS I PPYPI AGGVDDLDBDT 
PPIVSQFPGTMAKPPGSLARSSSLCRSRRSIVPSSPQPQRAQLA 
PHAPHPSHPRHPHHPQHTPHSLPSPDPDILSVSSCPALYRNEEE 
EBAIYFSAEKQWEVPDTASECDSLNSSIGRKQSPP/SKPRDIPN 
IIS/ DER YQEMRCP * R X PSGG I L 


5850 


3 


1895 


KAVLNFSASGSVISI»TGSNPMHXIASMWHLKKNGIIVYI,DVPLLN 
LI CRLKLM KTDRI VGQNSGTSMKDLL KFRRQ YYKKWYDARVPCE 
SGAS PEBVADKVLNAI KR YQDVDS ETFIS TRHVWPBDCEQKVSA 
BFFIEAVI EGLASDGGLPVPAKBFPKLS CGEWKSLVGATYVERA 
Q I LLE RCI H P AD I PAARLGEM I ETAYGENFACS KIAPVRHLSGN 
QF ILELFHGPTGSFKDLSLQLMPHI FAQCIPPSCNYMILVATSG 
DTGSAVLNGFSRLNKNDKQRIAWAFFPENGVSDFQKAQI IGSQ 
RENGWAVGVESDFDFCQTAI KR IFNDSDFTGFLTVEYGTI LSSA 
KS I NWGRLLPQ WYHASAYLDLVSQG FI S FGS PVDVCI PTGNFG 
KILAAVYAKMMG I PIRKFI CASNQNHVWTDFI KTG\HYBLRGKB 
R* AQTFFTVQ + 1 FLPNLSNLERHLHLMANKDGQLMTELFNRLES 
QHHFQIEKALVEKLQQDFVADWCSEGECLAAINSTYNTSGYILD 
PHTAVAXWAJDRVQDKTCPVIISSTAHYSKFAPAIMQALKIKEI 
NBTSSSQLYLLGSYNALPPLHEALLERTKQQEKMEYQVCAADMN 
VLKSHVEQLVQNQFI 


5851 


3120 




RCYLQFLALLLTSTSARAAAAIAAAEE P AGS PS VMTRAGDHNRQ 
RGCCGSLADYLTSAKFLLYLGHSLSTWGDRMWHFAVSVFLVEIjY 
GNSIiLLTAVYGLWAGSVLVLGAI I GDWVDKNARLKVAQTS L W 
QNVSVI LOG 1 1 LMMVFLHKHELLTMYHGWVLTS CYILI I T I ANI 
ANLASTATAIT I QRDWI VWAGEDRS KLANMNATI RR I DQLTNI 
LAPMAVGQIMTFGSWIGCGFISGWNLVSNC^YVLLWKVYQKT 
PALAVKAGLKEEETELKQLNLHKDTEPKPLEGTHLMGVKDSNIH 
EIiEHEQEPTCAS QMAEPFRTFRDGWVS YYNQPVF/ LGWHGSCFP 
IiYDCPGL*LHHHRVR1«SGTEWFHPQYFDGSISYNWNNGNCSFY 
LATSKMWFGSDRSDLRIGTAFLFDLVCDLCIHAWKPPGLVRFSF 


5852 


1 


422 



kttfpsslcplrqlpevrgysgqpltdplislcrshkcrgkgwg 
sssypslpallrarsapghcthrscgpewrids i srlemqgarr 
sgwaqaqptilllvprlrkslpsiwg/slmgffitsgpg/wfrq 
yyffisgrh*vlftbsdfyyvamdfgghglsshyspgvpyylqt 
fvs b irrwag kkqs vyfrrcggcs rap p l i tg g g vgs rkqrwp 

ca^/iW/UjAFuiji?AXnGR5yfES 


5853 


223 


1346 


RLLGLSRVKGLHGPAASAWISDPETRGDPGGPWGMWRGSDijkPR 

PVSLTGLTLVCK * AAQGPQ V\ HS VKLCFGLGG\ PCLL\ FPI FR P 

LLLHPRRPRLHPGTRGVAVEPHALRWHVAHGEEAGI RAAGPGH 

GGVEIPQG/VGSLGARRGLRPSRPSSRHRNRVPAPPPGRPLATP 

HRRRFPPDPALTCPGLGQDQGPREQQKQGSGRHDTILGDWGESE 

SRWVRGNPRTGTAATLIGFSRNPTLNGSENWGSLVSIQEEGPDT 

GWEREKRNPAEMGNPORWaQPTHTPPTifiPRTT.onM'OPnT dtsmdp 
vnytvuiuwirncA'ivuiryiuutJr ini tr truss c BluRnnr JBnuKnClJ> IS 

ALGIiRPDPATSVPSALS/QTF/PESWPRSCtiRNQGETLC3^GPVP 
LSSLCITESPSQNWTPCXLLLTCPRGLF 


5854 


86 


938 


kgrntapekkgaaliwrenass*ngy/srWkqdirrienhiiqe 
lxhlcamikrvllerlbntrklreltegrtldwpqnritevsak 
rqi vte yrekgkrn *eekkrdlegrsrrynlci ig i petedras 

GAETI KDXJjE/ ENFPEIjKNBLDLQMEKAHR I PLKFNEKKAASRH 
IRVTFL / KFQRRNI LQASSQR KQVTYKGAKVRLTS DFS PAI LNA 
RRQW/N/PISRVLRENNFEPRIIYSAKLSFLYKGNWKTFLDIQG 
LGKYINQELSLKILLKDLLQLTENLN 


5855 


536 


2391 


LRSYGCKAPSRISHLHK\FLFIiLLPSLLMGYSESPPPITDSWAP 
FIS LTHHVLSQSQS PLS SNCWI CLSTHTQ * FTALPADLLTWTQS 
NVSLHISYLAIPFLAD9FLKPV/L* PGKSAKHLSFKLSSLSMVS 
GRAVALIjHLIASGLTSIQTNTASSKPPIWGY\I*STQTSFISPPP 
LCLS RTYPNPAHATMVGQVPQSLCGLI FTL/RTP CRPS ILHPNY 
KI ISTS AWQKVLCFS GS PTIHTS LHLTTGSS FLS FHP I PG FPAA 
NSALYVSSLKGPPGKNVTIPSPVTGT*QPPHRGSN/RLTVDKDN 

FFLS PKPNS LHQLPSQ \ TP YQALTGAALAGS YP I WENENTLS WL 
PTFrYNFCLSTPSLPFLCDTN*YLCLPANWSGTCTLVFQAPTIN 
ILPPNQTILISVEAS ISSSPIRNKWALHLITLLTGLG ITAALGT 
GIAGITTSITSYQTLPTTLSNTVEDMHTSITSLQRQLDFLVGVI 
LQNWRVLDLLTTEKGQTCIYLQE£CCFCWESGrVHIAVRRI i HD 
RAAEL*HQVA!3SWWQGSSLl^V7IPWmPFIjGPLIFLFLLLMIGP 
uxrrfDVdKTA t> WKJjWCr I QASMQKHIDN I FHLCHV* YQS LRGNH 
SEAPEPRP 


5856 


173 


1137 


PWLtoGLGLSAVFLFYL* / YVTFHLYGGI I LLLLIFI S IAGIL.YK 
FQDVLLYFPEQPSSSRLYVPMPTGIPHENIFIRTKDGIRLNL.IL 
IRYTGDNS PYS PTI 1 YFHGNAGNIGHRLPNALLMLVNLKVNLLL 
VDYRGYGKSEGEASBEGLYIiDSEAVIiDYVMTSPDLDICTKIYLSG 
RSLG\GAAAIHLASDNSHRISAIMVENTFIjSIPHMASTLFSFFP 
MRYLPLWCYKNKFLSYRKISQCRMPSLFISGLSDQLIPPV>IMKQ 
LYELSPSRTKRLAI FPDGTHNDTWQCQGYFTALEQFI KEWKSH 
SPEEMAKTSSNVTII 


5857 


1597 


. 5Ji3 


KLIGKVLVIiSWADAI^FAVBPQGPAIiGSEPMMlXSSPT^^ 
VNAQFLPGFLMGDLPAPVTPQPRS ISGPS VGVMEMRS PLLAGGS 
PPQPWPAHKDKSOAPPVRSIYDDISSPGLGSTPIjTSRRQPNIS 
VMQSPLVGVTSTPGTGQSMFSPAS IGQPRKTTLS PAQLDP FYTQ 
GDSLTSEDH\I,DDSWGDCIWGFLKASA\SYILL\QFAQYGGIS* 
NMWMSNTGNWMHIRYQSKLQARKALSKDGRI FGES1MIGVKPCI 
DKS VMESSDRCALSS P SLAFTPP IKTLGTPTQPG STPRI S TMRP 
IATAYKASTSDYQVI SDRQTPKKDESLVS KAME YMFG W 


5858 


1 355 ' 


1419 


PPHQPAAASTSXHQQQQPPPPPQDSSKPVVAQGPGPAPGVGSAP 
PASSSAPPATPPTSGAPPGSGPGPTPTPPPAVTSAPPGAPPPTP 
P3SGVPTTPPQAGGPPPPPAAVPGPGPGPKQGPGPGGPKGGKMP 
GGPKPGGGPGLSTPGGHPKPPHRGGGEPRGGRQHHPPYHQQHHQ 
GPP PGG PGGRS EEKISGPRRG FKANLS LLRRPGE KTYTQRCRFC 
LLG I YLLI SRRMNS RRLFAKXWENQBKFLSTKAXDSEFI KLESR 
ALA*NCPKFELG* YTP*GGRQLPSSLFPTHACLPLSCSVI FS PF 

MFPQ^WCWGRKPFRPNLGPHLKGAVCNRWDDPWEGPTGKGHCLN 
FAS 


5859 


307 




GGSSARPRAS SRRMLSRKKTKNEVSKPAEVQGKYVKKETSPLLR 
NLMPS F I RHGPTI PRRTDICLPDSS PNAFSTSGDGWSRNQS FL 
RTPIQRTPHEIMRR2SNRLSAPSYLARSLADVPREYGSSQSFVT 
EVS FAVENGDS GSR YYYSDNFFIX3QRKRPLGDRAHEDYR YYEYN 
HDLFQRM P QNQ GRHAS G I GR VAATS LGNLTNHGS EDLPLPPGWS 
VDWTMRGRKYY IDHNTNTTHW S H PLERE GLPPG WBR VES S E FG T 
YYVDHTNKKAQY\RHPCAPTCTSV*ST?SCIil/AS/RQQTERNQ 
SLLVPANPYHTAEIPDWLQVYARAPVKYDHIIiKWELFQLADLDT 

YQG^^KLLFMKELEQI\^YFAYRQALLTEL^^RKQRQQWYAQQ 
HGKNF 


5860 
5861 


2956 


1270 


tirvebfplcpgggkaqi^sasllgmij^pptpppLlij^fp 

LLLFSRLCGAIiAGPI FVEPHV TA VWGKNVS LKCL I E VNET I T Q I 
SWEKIHGKSSQTVAVHHPQYGFSVQGEYQGRVLFKNYSLNDATI 

TLHMTfi PJ?nCf2 IfV T f gfl WPPT-ftKTn^Q a m>ni jiw it t ^ 
AJ ** M v» c ououm x tAnV x v k U3N AQSS 1 1 V 1 vL V BPTV SLI KG 

PDSLIDGGNETVAAICIAATGKPVAHIDWEGDLGEMESTTTSFP 
NETAT1ISQYKLFPTRFARGRRITCVVKHPALEKDIRYSFILDI 
QYAPEVSVTGYDGNW FVGRKG VNL KCNADANP PPFKS VWSRLDG 
QWPDGLLASDNTLHFVHPLTFNYSGVYI CiOT\NSPGS KBVTQK 
VHPTFQDPSLPTYP PLPALOFQ WAS PSTA* TSRD\ LATEP* Kl A 
PS PLSTL\ATI KGWTQLPTI I A* CSGVGALFIV\LVKCFGLGI F 
CYRRRRTFRGDYFAKNYIPPSDMQKESQIDVLQQDELDPYPDSV 

kecenknpvnnlirkdyleepektqwnnvenlnrferpmdyyedl 

KMGMKFVSDEHYDSNEDDLVSHVDGSVISRREWYV 




2051 


1305 



EVCACVQAR^VASSGDDSCXSGDKTOCEVGSWVGSWRVVMA^i^ 

SEGEQGI ptacaafaqqpag/bprrglagvgeggpqcs WVNYRC 

rLEFLVSLLGTDLARGRGNSASGPTAPADS KQL/ML*DVHRRVI 
LE*RMNSGSPARDNAPSQRPCTNLSEGLRFGISPSWREALYGCH 

A 


5862 


1556 


483 


PPFQLIMGEI KVS PDYNWFRGT VP LKKI IVDDDDS KI WS L YD AG 
PRS IRCPLI FLP PVSGTADVFFRQILALTGWGYRVIALQYPVYW 
DHLEFCDG FRKLLDH LQL DKVHLFGAS LGG FIAQKFAE YTHKS P 
RVHSL I LCNS FSDTS IFNQTWTANS FWLMPAFMIrfGGVLGNFSS 
GPVDPMMADAIDFMVDRLESLGQSELASRLTLNCQNSYVEPHKI 
RDI P VTIMDV FDQS ALSTEAKEEMYKL YPNARRAHLKTGGKP P Y 
LCRSAEVNLYVQ I HL/R / RNS ME PNTR PLTHQWS VPRS LRCRKA 
ALASARRSSS VSLAVNDBLTRCVLV* SVASAPVSRPFPSGSS GS 
PVLTVSGK 


5863 


2714 


249 


PFPSRGSLPIJUVPREDTMGPLMVLFCI^jFLYPGIADSAPSCPQN 
VNISGGTFTLSHGWAPGSLLTYSCPQGLYPSPASRLCKSSGQWQ 
TPGATRSLSKAVCKP VRCPAPVSFENG I YTPRLGS YPVGGNVSP 
ECBDGF I \LRGS PVRQCR PNGMWDGBTAVCDNG AGHCPNPG I SL 
OP \ VRTGFRFGHGDKVRYRCSSNLVLTGSS ERECQGNG VWS GTE 
PI CRQPYSYDFPEDVAPALGTSFSHMIiGATNiJTQKTKESIiGRKI 
QIQRS GHLNL YL LLDCS QS VS ENDFL I F KESAS LM VDR I FS F3I 
NVSVAIITFASEPKVLMSVLNDNSRDMTEVISSLENANYKDH3N 
GTGTNTYAALNSVYLMMNNQMRLLGMETMAW\QEIRHAI I LL \T 
DG K \ S HMGGS P KTAVDH I RE I LN INQ KRND YLDI YAI GVG KLDV 

CGVGNMSANASDOERTPWHVTIKPKSQET\C\RGALISDQWVLT 
AAHCFRDGNDHSLWRVNVGDPKSQWGKEFLIEKAVISPGFDV^A 
KKNQGIL\EFYGD\DIALL\KLAQK\^\STHCC^PSCLP\CTM 
\EANLGFLRETFKGSTCR\DHENEL/VWNKQSV\PAHF\VAi\N 
GSKLEHLTLRMGVE WTS CCRGLSP KKKTM \FPNLT \ DVRB\ WT 
D\QFL\CS \GPQEDESP\CX*E\SGGA\ VFXEKRFRLSAGGVWC 
SWGI*\YNP\CTiGSA\DKNS PKKGPSVAKVPPPTR/DFHIN\LPP 
Q * S PWLRQHPGGMS * 1 FLPLLANGHLS P FACPAR I CRPLEFLPS 
EWATLRTL 




173 


1013 



PIiISVPQSLISLPQPliLCFPGGQEPSAPSPCLYSFLWACSFTMG 
KLPPSIPPSSPLACVLKNIiKPl^IiTPDLKPKCLIFFCNTAWPQY 
KLDNDSK* PENGTFE FS ILQ VLDNS CHKMGKWS EV PD VQAF F \ S 
HWSLPSLCSQC/GLIPNLSSFSPFCSFG/PPPQVPSP/TESFFS 
MDSS DLP PS PQAAPRQAEPG PNSHLAS AP PPYHP F I TSPPHTWS 
SLQFHSVTSPPPPAQQFTLKKVAGAKGIVKVSAPFSLSQIR*RL 
GSFSSNIKIQPSSWLIWQQP 




568 


1684 


CLPGPRWGEGWRAGHTIVGCIFFKTAIISHFKGGMYJbCVCMCTC 
LSVCVCVQVGS WI CV/CVSMCACVSLCTC\ ICRCISMYTREHAC 
ACTRV*VYMCMS/VCTCVSTCIDVRVCAHVCVYMCLCLGYA*AC 
TCV*MCVCMHEHVCMC/ VCACS CVLL/CRGHICM/MCMS AYI C I 
/CVYVCVLCVWACMRMS^C^LVYG*ACTCV^^ C 
VHVCCMSMHACBCLCVYLHICGCAGTRRWWAGSARGSRSCSRLP 
CWAPGPGLSLPGPSCPSVEOGLGGGPGQLQGRSGKRRLGEHRGW 
GSPAAVCSRNCTVS PRRGADCFEAPDVP KQPPGWGRAS FEE RG C 
GGRGW VCAPPLNGPQCCCFSI KPELKAKXKK 


5866


98 


3197 


ARPEVPAP PAWLS RRGAAKMGDKKDDKDSPKKNKGKERRDL DDL 
KKEVAMTEHKMS VEE VCRKYNTDC VQGLTHSKAQE I LARDG PKA 
LTPPPTTPEWVKFCRQLFGGFS ILLWIGAI LCFLAYGIQAGTED 
DPSGDNLYLGIVLAAWT ITGCFS YYQEAKSSKIMESFKNMVPQ 
QALVIREGEKMQ VNAEEVWGDL VE IKGGDRVPADIiR 1 1 3 AHG C 
KVDNS S LTGESE PQTRS PDCTHE\NPLKTRNITFFS NNFVEGTA 
RGVWATGDRTVMGRIAXLASGLEVGKTPIAIEIEHFIQLITGV 
AVFLGVS FFILSLILGYTWLSAVI FLIGI IVANVPEGLLATVTV 
CLTLTAKRMARKNCLVXNLEAVBTLGS'f STI CSDKTGTLTQNRM 
TVAHMW FDNQIHEADTTEDQSGTS FDKS SHTWVALF * H/LLG FC 
NRPVFKGGQDNIPVLKRDVAGDASESAI*LKCIELSSGSVKLMRE 
RNKKVAEIPFNSTNKYQLSIHETEDPNDNRYLLVMKGAPERILD 
RCSTILIiQGKEQPLDEEMKEAFQNAYliEIiGGIX3ERVLGFCHYYI* 
PEEQFPKGFAFDCDDVNFTTDNLCFVGLMSM1GPPRAAVPDAVG 
KCRS AG I KV I MVTGDHP I TAKAXAKG VG 1 1 FBGNETVED IAARli 










NrPVSQVNPRDAKACVIHGTDLKDFTSKQIDEltQNHTEIVPAR 
TS PQQKLI I VEG CQRQGAI VAVTGDGVNDS PAL KKAD IGVAMG I 
AGS DVS KQAADM I LLDDNFAS IVTGVEEGRLIFDNLKKS IAYTL 
TSNiPEITPPLLFIMANIPLPLGTITILCIDljGTDMVPAISIiAy 
BAAESD I MKRQ PRNP RTDKLVNERL I SMAYGQIGM IQALGGFFS 
YFVILAENGFLPGNLVGIRLNWDDRTVNDIiEDSYGQQWTYEQRK 
VVEFTCHTAFFVSIVWQWADLIIOCTRiOJSVFQQGMKNKILIF 
GIjFZETALaAFXiSYCPGMDVALRMYPLKPSWWFCAFPYS flifv 
YDEIRKLILRRNPGGWVEKETYY 




5867 


3 


146£ 


LPGRRARGGRGLGW PPAQAZiDGS RMGKAKVPAS KRAPS S ? VAKP 

GPVKTLTRKKNKKKKRPWKSKAREVSKKPASGPGAVVRPPKAPE 
DFSONWKALOEWliIiKOlf^OAP P if PT.VT crtM/> c rvmvTT/vwww 

BTS PQVKGEEMPAGKDQEASRGSVPSG5KMDRRAP VP RTKASGT 
EHNKKGTKBRTNGDIVPERGDIEHKKRKAK\GQPQPHPPR/IDI 
WFDDVDPADIEAAIGPEAAKIARKQLGQSEGSVSLSLVKEQAFG 
GLTRALA1J)CEMVG VGPKGEBSMAAR VS I VTfQ YGKCVYDK YVKP 
TE P VTD YRTAVSGIRPENLKQG EELE WQKEVAEMLKGRI LVGH 
ALHNDLKVLFLDHPKKKIRDTQKYKPFKSQVKSGRPSLRLLSEK 
a uv?jju v wwviiriL^> iyiJAyA/WRliXVM VKKEWES MARDRRPLLTA 

PDHCSDDA*QSCPAAAAAPLQRQCDQSQGQITSPQSGNSGETFS 
ESWQRGVAWCY 




5868 


2122 


833 


LTAGASHTQDASQSTSAKYPAAAQNL/CVTNAMREDLADIWYIR 
AVTVYDKPAS FFKETPLDLQHRLFM KLGSMHS P FRARS EPEDPV 
TERSAFTERDAGSGLVTRLRERPAUiVSSTSWTBDEDFSILLAA 
LESRV* T\MTLDGHNL PS LVCVI TGKGPLREYYS RLIHQKHFQH 
» i r nu£u\Ctij x t* xjixLteaAiJijj VCiinTSSSGI^LPMKVVDMFG 
CCLPVCAVNFKCLHELVKHEENGLVFEDSEELAAQLQMLFSNFP 
DPAGKLNQFRianaRESQQLRWDESWVQTVLPLVMDT 


5869 


2122 


833 


ltagashtqdasqstsakypaaaqnl/ cvtnamredladi w y ipT~ 
avtvydxpasffketpldlqhrlf^klgsmhspfrarsepedpv 
ters afterdagsglvtrlre rpallvss ts wtededfs i llaa 

LESRV* T\MTLDGHNLPS LVCVITG KG PLRBYYSRLIHQKHFQH 

I0VCTPWLEAEDYPr.TiT.f3Q AriT/TUYT UTCQCnrm DMinnmMnm 

CCLPVCAVNFKCLHELVKHEENGLVFEDSEELAAQLQMLFSNFP 
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT 


5870 


2122 


833 


LTAGASHTQDASQSTSAKYPAAAQNL/CVTNAMREbLADIWYilR 
AVTVYDKPAS FFKElTIiDLQHRLFMKLGSMHSPFRARSEPEDPV 
TER5AFTERDAGSGLVTRLRERPALLVSSTSWTEDEDFSI LLAA 
LESRV*T\MTLDGHNLPSLVCVTTGKGPLREYYSRLIHQKHFQH 
IQVCTPWLEAEDYPLLLGSADLGVCLKTSSSGLDLPMKWDMFG 
CCLPV CAVNFKCLHEL VKHE ENGL VF EDS EE LAAQLQMLFS N FP 
DPAGKLNQFRKNLRESQQLRWDBSWVQTVLPLVMDT 


5871 


3 


3465 



FFrcRPLRLYSKTTGDRSAMAriAAGLTAEVSMKVLERRARTKR's 
VLKLL* LSLRRL*LEPT I *NGLLT*CSRLSVFRFLKV\GSVYE P 
LKS INLPRP DNETLWDKLDHYYRI VKS TLLLYQS PTTGLFPTKT 
CGGDQKAXIQDSLYCAAGAWALALAYRRIDDDKGRTHELEHSAI 
KCMRGILYCYMRQADKVQQFKQDPRPTTCLHSVFNVHTGDELLS 
YEEYGHLQINAVSLYLLYLVEMISSGLQI I YNTDEVS FIQNLVF 
CV\ERVYRVP\DFG\VWGKREGKYY*/SGSTELHSSSVGLGKRQ 
L*KQFNGFNLFGNQGCSWSVI FVDLDAIDfRNRQTLCSLLPRESR 
SHNTDAALLPCISYPAFALDDEVLFSQTLDKVVRKLKGKYGFKR 
FLRD3 YRTS LEDPNRC YYKPAE I KLFDG I ECEFP I FFL YMMIDG 
VFRGNP KQ VQE YQDLLTP VLHHTTEG YPWPKYYYVPADF VE YE 
KNNPGSQKR FPSNCGRDGKLFLWGQALYI IAKLLADEL IS PKDI 
DPVQRYVPLKDQRNVSMRFSNQGPLENDLVVHVALIAESQRIiQV 
FLNT YG I QTQTPQQVEP IQ I W PQQELVKAYLQLG INE KLGLSGR 
PDRPIGCLGTSFCIYRILGKTWCYPI IFDLSDFYMSQDVFLLID 
DIKNALQFI KQ YWKMHGRPLFLVL IREDN I RGS RFNP ILDMLAA 
LKXG 1 1 GGVKVHVDRLQTLISGAWEQLDFLR I SDTE ELPEFKS 
PEELEPP KHS KVKR QS STPSAP ELGQQPDVNIS EWKD KPTHEIL 
3KLNDCSCI«ASQAILJ^ILLKREGPNFITKEOTVSDHIERVYRR 








AGSQKLWS VVRRAASLLS KWDS LAPS I TNVLVQGXQ VTLGAFG 
EBEEVISNPLSPRVIQNIIYYICCNTHDEREAVIQQELVIHIGWI 
ISNNPELFSGTLKIRIGWI IHAME YELQIRGGDKPALDLYQLSP 
b B v KQiiiiiiU I LQPTONGRCWLNkRQI dgslnrtptgfydrvwqi 
LERTPNGI I VAGKHLPQQPTLSDMTMYEMNFSIiLVEDTLGNIDQ 
PQ YRQI WBLLMWS I VLERNPELEFQDKVDLDRLVKEAFNEFQ 
KDQSRLKEI EKQDDMTSFYNTPPLGKRGTCS YLTKAVMNLLLEG 
EVKPNNDDPCLIS 


5872 


68 


665 


VQGYMYRFVIKIN5CYSEKTS 1 CRHRCCPELPATQPWPTPTVFP 
NIAIDSESLGCI \SFKLFADKV/ PKRWKKNFVLLNTGEKVLGDK 
G P CFYR 1 1 PG \LCQGGDFTHHNGTGGKSL YS KEFDDENFI / liKH 
TAPGVLSTANAGPTTNGSQFFI CTAKTEDG*QHWFGKVKDGMS 
IVEALERSGSRNGKTSKKI TAANCGQL 


5873 


2240 


506 


RRPPEGGSGGGRRTRARMPLPWSLALPIjLLSVn/AGG'FGtiAASAR 
HHGLXiASARQPGVCHYGTKLACCYGXflRRNS KGVCEATCE PGCKP 
GBC VG PNKCRCF PG YTGKTCSQDVME CGMKPRPCQHRC VNTHGS 
YKCFCLSGHMLMPDATCVNSRTCAMINCQYSCEDTEEGPQCliCP 
SSGLRLAPNGRDCLDIDECASGKVICPYNRRCVNTFGSYYCKCH 
IGFELQ YISGRYDCID INECTMDSHTCSHHANCFNTQGS FKCKC 
KQGYKGNGLRCSAIPENSVKEVLRAPGTIKDRIKKEjIiAHKNSMK 
KKAKI KNVTPE PTRTPTP KVNLQp FNYEETVS RGGNSHGG \ KKG 
NEEKMKEGLE DEKREEKALKD*HRRERP FRG\ DVFFPKVNE AGE 
FGL I L \ VQRKALTS KLE^ KADLN I S VDCSFNHG \ ICDW \ KQDR\ 
BDDFDW\NPADR\DNAI\GFY\MAVPGLWQGHK\KDIGRLKLLL 
PDU)PQSMFCLLFDYRIAGDKV^KLRVFVKNSNNALAWEKTTSE 
DBKWKTGKIQLYQGTDATKS 1 1 FEAERGKGKTGE IAVDGVLLVS 
GLCPDSLLSVDD 


5874 


2 


3387 


ACPRLARRRRR VRS LRRRRGWLRARWSRGQHNMAARRI TOE TFD 
AVLQEKAKRYH^ASGEAVSHrnjQFKAQDLLRAVPRSRABMYDD 
VHSDGRYSLSGSVAHSRDAGRESLRSDVFSGPSFRSSNPS ISDD 
S YFRKECGRDLE FSHSNS RDQVIGHRKLGHFRSQDWKFALRGS W 
EQDFGH PVSQES S WS QEYS FGPS AVLGDFGS SRL I EKECLEKE \ 
SRDYDVDHSG\EA\DSVLRG8\SQVQA\RGRALNIVDQEGSLLG 
. KGETQGIXTAJ03GVGKLVTLRNVSTKKIFrVNRITPKTQGTNQI 
QKNTPS PD VTLGTNPGTED IQFPIQKI PLGLDL KNLRLP RRXMS 
FDI IDKSDVFSRFGI E 1 1 KWAG FHT1 KDD I KFSQLFQTLFBLET 
ET CAKM LAS FKCSLKPEHRD FC FTTIKFLKHS ALKTPRVDNEFL 
NMLLDKGAVICTKNCFFEI I KPFDKY IMRliQDRLLKS VTPLLMAC 
NAYELSVKMKTLSNPLDLALALSTTNS LCRKSLALLGQTFSLAS 
SFRQEKIL*AVGLQDIAPSPAAFPNFBDSTLFGREYIDHLKAVIL 
VSSGCPLQVKKAEPEPMREBEKMIPPTKPEIQAKAPSSLSDAVP 
QRADHRWGTIDQLVKRVI EGSLSPKBRTLLKEDPAYWFLSDEN 
SLEYKYYKLKLAEMQRMSBNLRGADQKPTSADCAVRAI'ttiYSRAV 
RNLKKKLL P \ WQRRGLLRAQG \ IiRG \ W KARRA\TTGTQTLLFI» R 
APGLKHHGRQAPGLS \QAKPS LPDRND\AAKD\CPLDPV\GPS P 
QD PS LEAS GPS PKPAGVDIS EAPQTSS PCPS ADIDMKDNGRTAE 
KIiARFVAQ VG \ PE I EQF \£ I \ ENS TDN PDLWFL\KDQNS S \AFK 
FY\RKKVFELCPS I CFTSSPHNL\HTGGGDTT\GSQESPVDLME 
GEAEFEDEPPPREAELESPEVMPEEEDEDDEDGGEBAPA\PGRG 
GPS LEGS T PADGLPGEA\ AEDDL /ALGAPALFTGLLQVTCFP FG 
RGFSSKSLKVGMIPAPKRVCLIQEPKVHEPVRIAYDRPRGRPMS 
KKKKPKDLDFA00KI»\TDK\NliGFO\MIxOKMGWKEGRni/;«!T v 
gir\srsactqqaawggsgwgi*s PSTCSLPLGSFTAKMAYSWQL 
IFVF 


5875 


296 


1848 


XiAALGGLPLWRLS RRG FREY LLG LSAPSALGGAMRS VS YVQRVA ' 
LE FS GS L FPHAI CLGDVDNDTLNEL VVGDTSGKVS VYKNDDSRP 
WLTCSCQGMLTCVGVGDVCNKGKNLLVAVSAEGWFHLFDLTPAK 
VLDASGHHETLIGEEQRPVFKQHI PANTKVMLI SDIDGDGCREL 
WG YTDRWRAFRWEELGEGPEHLTGQLVS LKKWMLEGQ VDS LS 
VTLGPLGLP ELM VSQPGCAYAILLCTWKKDTGS PPASBGPTDGS 
/ S GDP S CPRRG AAPD I W P Y PQQB CLHS PNWQHQT\ SHGTES S GS 


5876






GLFAI^ll,UGTLKLMBEMEEADKIiLWSVQVDHQLPALEKLDV , i J G — 
NGHEEWACAWDGQTY 1 1 DHNRTWRPQVDEMIRAFCAGLYACK 
EGRNS PCLVYVTPNQKI YVYWEVQLERMBSTNLVKLLETXP \ ST 
TACCRSWAWILTTSL * LVPCFTKRSTIQTSHHS VLPQASRI P PS 
WTCL IAGBGFF* TPTLP PKGVFGS HCAAAGS ITKQ 




1122 


224 


HLPLGVPSKVAGAAAMEPQEBRETQVAAWLKKIFGDHPIPQYBV 
^RTTEILHHLSfiRNRVRDRDVYLVIEDLKQKASEYESEAKYLQ 
DLLMES VNFS PANLS S TG SRYLNALVDS AVALETKDTS LAS F I P 
AVNDLTSDLFRTKSKSEElKIELEKLEKNLTATT.VT.RTrrr rnmr 
KKAELHLSTER\AKVDNRRQNM\DFLKAKSEEFRFG IQAAGEQL 
SARGQ\DAFSVPIQSLVAt.IRENWPRLKQQTIPLK\KKLESYLD 
LMP \NPSHCSK*RIBEAK\RELA\S IEABLTRRV9\MMEL 


5877 


2030 


1907 


GTLGKMAAS SSGEKEKERLGGGLGVAGGNSTRERLLSALEDLBV 
LSRELIE>IIAISRNQKIjLQAGEENQVLEIiLIHRDGSFOELMKLA 
LNQGK I HHBMQVLEKE VEKRDSDI QQLQKQLKEAEQI LATAVYQ 
AKEKIiKSIEKARKGAISSEEIIKYAHRISASNAVCAPLTWVPGD 
PRRPYPTOLEMRSGLLGQMNNPSTNGVNQHLPGDALA/RRKIAR 
CPCSTVS/NGSQMTCR* INI ILIIjQKSVCEL 


5878 


950 


2113 


GLWKCMQLQC5PHTHRVQP * PTPRQQGPQ \ VPVAVIAGNRPNYLY 
RMLRSLLSAQG VS PQMI TVFIDG YYEEFMDWALFGLRGI QHTP 
IS I KNAR VSQHYKASLTAT FNLFPEAKFAWliEEDLD IAVDF FS 
FLSQS I HLLEEDDS LYC I S AWND QG YEHTAED PALL YR VETMPG 
IiGWVl^RSLYKJBELEPKV7PrPEKLWDWDMWMRMPEQRRGRECII 
PDVSRSYHFGIVGLNMNGYFHEAYFKKHKFNTVPGVQLRNVDSL 
EGCEAYEVEVHRLLS EAE VLDHS KNP CEDS FLPDTEGHTYVAF IR 
ME KDD D FTTWTQLAKCLH I WDLD VRGKfWPGT .WW r . PD vvivru ttt xnr 
GVPAS PYSVKKPPSVTPI FLEPPPKEE3APGAPEQT 


5879 


3 


981 


RLTBAAAAGs«5RAAGWAGSPPTLLPLSPTSPRCAATMASSDED 
GTNGGAS EAGE DREAPGKRRR LG FLATAWLT FYD I AMTAG W L VL 
AIAMVRF YMEKGTHRGL YKS IQKTLKFFQTPALLE IVHCL I G I V 
PTE VI VTGVQVS SR I FMVWL ITHS I KP I QNEES WLFLVAWTVT 
BITRYS F YTFSLLDHLPYFI KWARYNFFI ILYPVGVAGELLT I Y 
,AALPHVKKTGMFS IRLPNKYNVS FDYY YFLLITMASYI PLFPQL 
YFHMLRQRRKVLHG\G*L* KRMIK*SLQTRCFFQNNQDYLS PS F 
NNKNKQLCEISWIVWFLKI 


5880
5881


1138 


1324 


' t ajv^^^vj j.j_trt H xJt'K£»>il<.vj 1 roALTACoA 
S VTS KG KS SS GMW PS AASDRDS P VP LRP PG PVQL PSGTGW VLS D 
♦KKKRGRCSS/WLSQPQHEREKEVVLUIRSMAEGERARAASDVL 
CRSLANETHQLRRTLTATAHMCQHLAKCLDERQHAQRNVGERSP 
DQSBHTDGHTS VQS VI EKLQEENRLLKOKVTHVPn r .xra Kwnv vxt 

ASRDEYVRGLHAQLRGLQIPHEPELMRKEISRLNRQLEEKINDC 
A2VKQE LAAS RTARDAALE R VQMLE Q Q I LAYKDD FMS ERADRE R 

AQSRIQELBEKVASLLHQVSWRQDSREPDAGRIHAGSKTAKYLA 
ADALELMVPGGWRPGTGSQQPEPPAEGGHPGAAQRGQGDLQCPH 
CLQCFSDEQGEEliLRHVAECCQ 




26 


441 


GGIHPSPTEAi'KAQHLTMDCTWRlLFLVAAATGTHAQVQLLQSG 
SEVKKPGASVMVSCYVSGYTLTKLSMHWVRQAPGKGLB*MGPFD 
LQDVET I YPQKFQGRVSMTEETSTETTQ/AYIiELSS LRSEDTAV 
HHCATDTV 


5882 


2407 


2215



SGCVEMLYSHSLEYNPEWISVQSAVAPAQLALNSDGDL*LHSGE 

rtrrd*qlpsaggpglqeplqlgelditsdefildevdg\vdlr 

HYSKQVE LELQQ IEQKS I RD Y I QESENIASLHNQI TACDAVLER 
MEQMLGAFQSDLSSISSEIRTLQEQSGAMNIRLRNRQAVRGKLG 
ELVDGLWPSALVTAILEAPVTEPRFLEQLQELDAKAAAVREQE 
^GTAACADVKGVLDRLRVKAVTKIRBFILQKIYSFRKPMTNYQ 
I PQTALLKYRFFYQFLLGNERATAKEIRDEYVETLSKIYLSYYR 
5 YLGRLMKVQ YEEVAEKDDLMGVEDTAKKGFFSKPSLRSRNTI F 
riiGTRGS VI S PTBLEAP I LVPHTAQRGEQR YPFEALFRSQH YAL 
[iDNSCREYLFICEFFWSGPAAK^LFHAVMGRTLSMTLKHLDSY 
^ADCYDAI AVFLCIHI VLR FRNIAAKRDVPALDR YWEQ VLALL W 



PR FE L I IjEMN vus VRSTD PQ RIXjGIiDTRPHY I TRRYAEFSS Al> V ~ 
SINQTI PNERTMQLLGQLQVEVENFVLRVAAEFSSRKEQLVPL1 
NNYDMMLGVLM\E* ERAADDSKEVBS FQQLLNARTQEFI EELLS 
PPFGGLVAFVKEABALIBRGQABRLRGEEIARVTQLIRGFGSSWK 
S S VES LSQDVMRS FTNFRNGTS 1 1 QGALTQL IQ \ LYHRFHR V\L 
SQPQLRALPARAELINIHHLMVBLKKHKPNF 



5884 



4261 



-2^2" 



EFPGRRFRAVlvlEAGAGAGAGAAGWSCP GPGPTVTTLGSYEASEG 
CERKKGQRWGSLERRGMOAMEGEVLLPALYEEEEEEEEEEEEVE 
EEEEQVQKGGS VGS LS VNKHRGLSLTETE LEELRAQ VLQL VAEL 
EETRELAGQHEDDS LELQGhhEDERLASAQQAEVFTKQXQQLQG 
ELRSLREEISLLEHEKESELKEIEQELHLAQAEIQSIiRQAAEDS 
ATEHESDIASLQBDLCRMQNELBDMERIRGDYEMEIASLRAEME 
MKSSEPSGSLGLSDYSGLQEELQBLRERYHFLNEEYRALQESNS 
SLTGQLADLESERTQRATERWLQSQTLSMTSAESQTSEMDFLEP 
DPEMQLLRQQLRDAEEQMHGMKNKCQEXCCELEELQHHRQVSEE 
EQRRLQR E LKCAONEVLRFQTSHS \SPS HPLPPIP PS S P CLL * A 
LWISALLWCWWAETSS 



5885



900



-46T 



GVLARASARIjRVPLTGVRACAE^PEW3A^PAKVAGAAEPDEDGGR 
SRLRDCGDYTPSERLGPKGAMIjWFQGAI paaiatakrsgavfv V 
FVAGDDEQSTQMAASWEDDXVTEASSNSFVAIKIDTKSEACLOF 

sqiypwcvpssffigdsgipleviagsvsadelvtrihkvrqm 
hllksetsvangsqsessvstpsasfepnntcensqsrnaelce 
ipstsdtksdtatggesaghatssqepsgcsdqrpaedlnirve 

RLTKKriEBRREEKRKEEEQRElKKEIERRKTGKEMLDYKRKQEE 
BLTKRMLEERlTOBXAEDRAAREJlIKOX3IALDRAERAARFAICrKE 

bveaakaaalijucoabmevkresyarerstyariqfrlpdgssf 
tnqfpsdapleeiarqfaaqtvgntygnfslatmfprreftkedy 

KKKLLDLEIAPSASWliIiP/ALFINF*AGRPTASIVHSSSGDIW 

tllgtvlypflaiwrlisnflfsnppptqtsvrvtsseppnpas 
ssksekrepvrkrvlekrgddfkkegkiyrlrtqddgedenntw 

NGNSTQQM 



1341 



AAGGGRRSRLSRStoPTGP3K SPSGVRCCG\RR\AWEDKDEFIiD V" 

iyhfrqiiavvxgviwgvlpi^gflgiagfclinagvlylyfsn 

YLQIDEEEYGGTWELTKEGPMTSFA/ IVHGHliDHl»LHCHPL*IiM 

vyssqvlpiqskgps 



5887



1937 



104 



PFRGRALTLKKQPRPGVAPPSLGTCHKSDPGkPAAOSOPPSPGS 
GTFGLLS FRMVRTKTWTLKKHFVG YPTNSD FELKTS EL PP LKNG 
EVLLE ALFLTVD PY^VAAIGILKBGDTMMGQQVAKVVE S KNVAI* 
PKGTIVIJ^PGWTTHSISDGKDLEKLLTEWPDTIPLSLALGTVG 
MPGLTAYFGLLEICGVKGGETVMVNAAAGAVGSVVGQIAKLKGC 
KWGAVG SDEKVAYXQKIiGFDVVFNYKTVESLEETIiKKAS PDG Y 
DCYFDNVGGEF3NTVIGQMKKFGRIAICGAISTYNRTGPLPPGP 
PPEIGIYQELRMEAFVVYRW0X3DARQKALECDLLKWVLELPYFVI 

D*LQANTLVYKSMKSAKPSLEYISEKLVSG\KIQYKEYIIEGFE 
NMPAAFMGMliKGDNLGiCr I VKA 



APGCRGCRATRCPCRGPRWDSLGDEAARSPAAPGGAPGLLGLRE"" 
RPDRCHPGGDDRGPQLHRGSPG/SPSELSRRPGPPGIi 0 GLQGPP 
PAPGLPQSRTL/PVLCVCDLSPAQCDINCCCDPDCSSVDFSVFS 
ACSVPVVTGDSQFCSQKAVI YSLNFTANPPORVFELVDQINPS I 
FCIHITN\ *NLHYPLLIQKYL/NENNFDTLMKTSDGFTLNAESY 
VS FTTKLD I PTAAKYE YGVPLQTSDSFLR FPSSLTSS LCTDNNP 
AAFLVNQAVKCTRKINLEQCE E I EALSMAFYSS PE I LRVPDS R K 
KVPITVQS I VIQSLNKTLTRREDTDVLQPTLVNAGHFS LCVNW 
LEVKYSLTYTDAGEVTKADLSFVLGTVSS VVVPLQQKFE IHFLQ 
ENTQPVPLSGNPGYVVGLPLAAGFQPHKGSGI IQTTNRYGQLTI 
LHSTTEQDCIiALEGVRTPVLFGYTMQSGCKLRLTGALPCQLVJaQ 
KVKS LItWGQG F PDYVAP FGNS QGP/ ADMLD W VP IH F ITQS FNRK 
DS CQLPGAIiVIEVKWTKYGS LLNPQAKI VNVTANL I S SS FPEAN 
SGNERTILISTAVTFVDVSAPAEAGFRAPPAINARLPFNFFFPF 


5886 


375 


2302 


LLCRTPGVAMQkADSKQPSkRPR^DSPRTPSOTPSAEADWSPG" 
L ELHP DYKTWGPEQVCSFLRRGG PEEP VLLKN IRENE I TGAIiLP 
CLDES RFENLGVS SI/3BRKKLLS Y IQRLVQ I HVDTMKV IND P I H 
GHIELHPLIiVRIIDTPQFQRLRYXKQLGGGYYVFPGASHNRFEH 
S LG VG YLAGCLVHALGEKQ PELQ I S 2RD VL CVQ IAGLCHDLGHG 
PFSHMFDGRFIPLARPEVKWTHBQGSVMMFEHLINSNGI KPVME 
QYGLI PEED I CFIKEQI VGPLBSPVEDSLWPYKGRPENKS FLYE 
IVSNKRNGlDVDKWDYFARDCHHLGrQNNFDYKRFIKFARVCEV 
DNE LR I CAR D KE VGNLYDM FHTRNS LHRRAYQHKVGNI I DTMIT 
DAFLKADDYIE XTGAGGKKYRISTAIDDMEAYTKLTDNI FIiEIL 
YSTDPKIiKDAREILKQIEYRNLFKYVGSTQPTGQIKIKREDYES 
LPKEVASAKPKVLLDVKLKAEDFIVDVTNMDYGMQBKNPIDHVS 
FYCKTAPNRAIRITKNQVSQLLP \ E KFAEQ \ L IR VYCKKVD RKS 
LYA\ARQYFVQW\CADR\NFT\KPQDGRCY«PPTP*HPQKKGW\ 
NDSTFSPKIPTRLPRRLPKSRV\QLFXDDPM 


5883 


1831 


731 


LPAACGRPVTARPRQAPEGRSGRPRDIOPYPPQVFPPRPDRVAI 
VTGGTDGIGYSTAKHLARLGMHVI IAGNNDSKAKQ W3KIKEET 
LNDKET*VLLCCPGWLCLWNSSDPPTSASRGAGTTGVHHHFLLK 
FGIFIL\DIASMTSIRQFVQKFKMKKI?LHVLIN1WGVMMVPQR 
KTRDGFEEHFGIjNYLGHFLLTNLLIjDTIjKESGSPGHSARVVTVS 
SATHYV AE LNMDDLQS S A CYS PHAAYAQS KLALVL FTY HIiQRLL 

AAEGSHVTANVVDPGVVNTDLYKHVFWATRLAKKLl^LLFIOT 
DEGAWTS IYAAVTPELEGVGGRYL YNKKETKSLHVTYNQKLQQQ 
LWSKSCEMTGVLDVTL 


DO SU 



1322 


200 


FRRGWS AAGRAVPVAFCSR I S ASS PRRPRGAVRLQSGTEAACRS 
GRP D P R PAS AAGGHAG ERM S Q RDTLVHL FAGGCGGTVGAI LT CP 
LEW KTRLQS S S VTI«Y I S EVQLNTMAG AS VNRWS PGPLH CLXV 
ILEKEGPRSLFRGLGPNLVGVAPSRAIYFAAYSNCKEKLNDVFD 
PDSTQVHMISAAMAGFTAITATNPIWLIKTRLQL* /SQGTAGKR 
RMGAFECVRKVYQTDGLKGFYRGMSASYAGISETVIHFVIYESI 
KQKLLE YKTASTMEKDEESVKEASDFVGMMLAAATS K\LVATTI 
AYPHEWRTRLREEGTKYRS FFQTLSUUVQEEGYGSIiYRGLTTH 
LVRQIP\NTAlMMATYELVVYXiIjNG r 




1322 


200 


FRRGWS AAGRAVPVAFCSR I SAS S PRRPRGAVRLQSGTEAACRS 
GRPDPRPASAAGGHAGERMSQRDTLVHLFAGGCGGTVGAIT.TCP 
LEVVKTRIXJSSSVTIjYISBVQLNIWAOASVNRVVSPGPLHCLKV 
ILEKEG PRS LFRG LGPNLVGVAPSRAI YFAAYSNCKEKLNDVFD 
PDSTQVHMISAAMAGFTAITATNPIWLIKTRLQL*/SQGTAGKR 
RMGAFBCVRKVYQTDGLKGFYRGMSAS YAGI SETVIHFVI YES I 
KQKLLEYTCTASTMENDEESVKEASDFVGMMIiAAATSK\l,VATTI 
AYPHEVVRTRLREEGTKYRSFFQTLSLLVQEEGYGSLYRGLTTH 
LVRQIP \NTAI MMATYELWYLLNG 


5892 


17^4 


379 


WLR VCGRLS VNS A VS S RTG G W S AGLTCAMQRLQ WLGH LRG PA 
DSGWMPQAAPOiSGAPHASAADVVVVHGRRTAICRAGRGGFKDT 
TPDELLSAVMTAVLKDVNLRPEQLGDICVGmrtiQPGAGAIMARI 
AQFLSDI PETVPLSTVNRQCSSGLQAVASTAGGIRNGS YDIGMA 
CGVESMS LADRGNPGNI TSRLMEKEKARDCL t PMG ITS ENVAER 
FGI SRE KQDTFALAS QQ KAARAQ S KGCFQAE I VPVTTTVHDDKG 
TKRSrrVTQDEGIRPSTTMEGLAKIiKPAFKKDGSTTAGNSSQVS 
DGAAAILLARRS KAEELGLP I bGVLRSYAVVGVPPD IMG IGPAY 
AI PVALQKAGLWSDVDI PEINE \AFASQAAYCVEKLRLPP * EG 
* TPLGGASGP * GHPLGLHWGHVQVI TLAQ * S * SARGKRAYRSGC 
PCAIGSWNGS PLPVFEYPWGT 


5893 


3 


1653


ILSKRRCQKAKTKELMAKKVAVIGAGVSGLISLKCCVDEGLEPT' ' 
CFERTED IGGVWRFKENVEDGRAS I YQSWTNTS KEMS C FSDF P 
MPEDFPNFLHNSKLLEYFRIFAKKFDLLKYIQFQTTVLSVRKCP 
DFSSSGQWKVVTQSNGKEQSAVFDAVMVCSGHHILPHIPLKSFP 
GMERFKGQYFHSRQYKHPDGFEGKRILVIGMGNLGSDIAVELSK 
NAAQVFISTRHGTWVMSRISETOYPWDSVFHTRFRSMLRNVLPR 
TAVKWMI BQQMNRWFNHENYGLBPQNKYIMKEP VLNDDVPSRLIj 
CGAr XVKSTVKELTETSAI FEDGTVEENIDVI I FATGYS FS FPF 






WO 01/53312 



PCT/US00/34263 











LE DSLVKVENNMVS LYKYI FP AHLDKSTLAC IGL I Q PLGS I F PT 
AELQARWVTRVFKGLCSLPSERTMMMDI IKRNEKRIDLFGESQS 
QTLQTNYVDYLDBIJ^EIGAKPDFCSLLFKDPKIAVRLYFX3PCN 
S Y* YRLVGPGQWEGARNAIFTQKQRILKPLKTRALKDS SNFS VS 
FLLKI LGLLAVWAFF \ CQLQWS 


5894


174 


1573 


RYSPKKVLQNKBSSLKLGMAXALVSAHSLAPLNLKKEGLRVVRE 
DKTSTWBQGFKLQGNSKGLGQEPLCKQFRQLRYEETTGPREALS 
K UKCiA-UUWijg PBTHTKEHILELLVLEQFL 1 1 L PKELQARVQEH 
HPESREDWWLEDLQLDLGETGOQVDPDQPXKQKILVEEMAPL 
KGVQEQQVRHECEVTKPEKEKGEETRIENGKLrWTDSCGRVBS 
SGKI S EPME AHNEG SNLERHQAKPKE KI E YKCS EREQR F I QHLD 
L I EHASTHTGKKLCBS D VCQSSS LTGHKKVLS * ERKVIQ C\HGV 
LGKAFQRSSHLVRHQKIHLGEKPYQCNECGKVFSQNAGLLEHIiR 
IHTGEKPYliClHCXJKNFRRSSHLNRHQRIHSQEEPCBCKECGKT 
FSQALLLTHHQRIHSHSKSHQCNECGKAFSLTSDL1RHHRIHTG 
EKPFKCNICQKAFRLNSHLAQHVRIHNEEKPYQCSECGEAFRQR 
SGLFQHQRYHHKDKLA 


5895


2967 



86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGB " 

KRLFVSDGVPGOjPVIJU^AGRARGRAEVLIS^VGPEDCVVPFLT 

RPKVPVLQLDSGNYLFST3AICRYFF\LiLSGWEQDDLTNQWLEW 

EATELQPTLSAALYYL\WQGKKG\EDVLGSVRRTLTHIDHSLS 

RQ\NCPFLAGETESLADIVLWGALYPLLQI>PAYLPEBLSALHSW 

FQTZ*STQ\EPCQR\AARRLVLKQ\QGVIiALR\ pylqkqpqpspa 

EGKGLS P I EP EE EE LATLS EEE I AMAVTAWE KGLES LP PLRPQQ 

NPVLPVAGERNVL ITSALP YVNNVPHLGN I IGCVLSADVFARYS 

RLRQ1WTLYLCGTDEYGTATBTKAL\EEGLTPQEICDKYHIIHA 

□IY\RWFNISFDIFGRTTTPOQ\TKIT\QDIFQQLLKRGFVLQD 

TVEQLRCEHCARF\LADRFVEaVCPFCGYEEARGDQCDKCGKLI 

NAVELKKPQCKVCRSCPVVQSSQHLFLDLPKLEKRLEEWLGRTL 

PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E 

GFEDK\ VFYVWFDAT IG YLS I TANYTDQ WERWW \ KNPEQ VDL YQ 

FM \ AKDNVPFHS LVFPS S ALGAE DNYTL \ VSHL I ATEYLN YEDG 

K\FSKSRGVGVFRDm\AHDTGIPPDISRFYL\IjYIRPEGK\DSA 

fswtdlllknns\ellnnlgnfxnra\gmfvskffgg\yvpemv 

LTPDDQRIiIA\>IVTLEIiQnYlIQ\LLEKVRIRDAljRS ILTI S \RH 
GNQYI \ QVNE PW\ KR I KGSEADRQRAGTVTGLAVNI AALLS VML 
QPYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGHQIGTVSP 
LFQKLEJJDQIESLRQRFGGGQAKTSPKPAVVBTVTTAKPQQIQA 
LMDEVTKQGNIVRELKAQKADKNEVAASVAKLLDLKKQLAVAEG 
KPPEAPKGKKKK 


5896


2967


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGB 
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCVVPFLT 
RPKVP VLQLDS GNYLFS TSAI CRY FF\LLSGWBQDDLTNQWLEW 
EATELQPTLSAALYYL\VVQGKKG\EDVLGSVRRTLTHrDHSLS 
RQ\NCP FLAGS TESLAD rVLWGALYPLLQDPAYLPEELSALHS W 
FQTLSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA 
EGKGLSPIEPEEEELATLSEEEIAMAVrAWEKGLESLPPLRPQQ 
NPVLPVAGERNVLITSALPYVNNVPHLGNI IGCVLSADVFARYS 
RLRQWNTLYLCGTDBYGTATETKAL\EEGLTPQEICDKYHI I HA 
DIY \ RW FNISFDI FGRTTTPQQ\TKI T\ QDIFQQLLKRGFVLQD 
TVEQLRCEHCARF\ LADRFVEGVCP FCG YEEARGDQCDXCG KL I 
NAVBLKKPQCKVCRSCPWQSSQHLFLDLPKLEKRLEEWLGRTL 
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E 
GFEDK\VFYVMFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ 
FM\ AKDNVPFHSLVFPSSALGAEDNYTL \VSHLIATE YLNYEDG 
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA 
FSWTDLLLKlWS\EIIjNNIiGNFIl^\GMFVSKFPGG\YVPEr'lV 
LTPDDQRLLA\HVTLELQHYHQ\LLEKVRIRDALRSILTIS\RH 
GNQYI \QVNEPW\KRI KGSEADRQRAGTVTGLAVNIAALLS VML 
QPYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGHQIGTVSP 
LFQKLBNDQI ESLRQRFGGGQAKTS PKPAWETVTTAKPQQ I QA 








LMDEVTKQGNIVRELK^QKADKNEVAAEVAKLLDLKKQLAVAEG 
KPPEAPKGKKKK 


5897 


2967


86 


HPSLLGAXPFYPPPSSPWPPPLYLPWNSHRKSRHFINQRGIHGB 
MRLPVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCWPFLT 
RPKVFVLQLDSGNYL PSTSAI CRYFF\LLSGWEQDDIiTNQWIiEW 
EATELQPTLSAALYYL\ WQGKKG \ EDVLG S VRRTLTHI DHS LS 
RQ\NCPFLAGETBSIiADIVLWGAIiYPLLQDPAYLPEE1>SALHSW 
FQTLSTQ\BPC3QR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA 

egkglspiepeebelatlseeeiamavtawexgleslpplrpqq 
np vlpvage rnvl i tsalpyvnnvphlgni igcvlsadvfarys 
rlrqwntlylo3tdeygtatetkal\eegltpqeicdky:-iiiiia 
diy\rwfntsfdifgrtttpqq\tkit\qdifqqllkrgfvlqd 
tveqlrcehcarf\ladrfvegvcpfcgyeeargdqcdkcgkli 
navelkkpqckvcrscpvvqssqhlfiidlpklekrleewusrtl 

PGSDWTPNAQFITPFFGFREW PS KPRWQ* TRDIjK\WGNPGTP * K 

fm\akdnvpfhslvfpssalgaednytl\vshliateylnybdg 
k\fsksrgvgvfrdm\ahdtgippdisrfyl\lyirpegk\dsa 
fswtdlllknns \ellnnlgnfinra\gmfvs kffgg \ yvpemv 
ltpddqrlla\hvtlelqhyhq\llekvrirdalrsiltis\rh 
gnqyi \qvnepw\krikgseadrqragtvtgiiavniaallsvml 
q p ymptvs at iqaqlqlp ppacs i lltn flctlpaghqigtvs p 

IiFQKtiENDQlESIiRQRFGGGQAKTSPKPAWETVTTAKPQQIQA 
LMD3VTKQGNI VRELKAQKAD KNE VAAEVAKt iLDL K KQLAVAEG 
KPPEAPKGKKKK 


5898 


2967 


86 


HPS uLGAI P FYPPPS S P W PP PL YL FWNSHRKS RHFINQRGIHGE 
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCVVPFLT 
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW 
EATELQPTL3AALYYL \ WQG KKG \ EDVLGS VRRTLTHI DHSLS 
RQ\NCPFIAGETBSLADI VIiWGALYPLLQD PAYLPEBLSALHSW 
FQTLSTQ\EPCQR\AARRLVLKQ\QGVI»ALR\PYLQKQPQPSPA 
EGKGLSP I EPEEEEIJVTLSEEEIAMA VTAWEKGLESLPPLRPQQ 
NPVLPVAGERNVLITSALPYVNKVPHLGNI IGCVLSADVFARYS 
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA 
D1Y\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD 
TVEQI^CEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCX3KLI 
NAVELKKPQCKVCRSCPWQSSQHLFLDLPKLEKRLBEWLGRTL 
PGSDWTPNAQFITPFFGFRBWPSKPRWQ*TRDLK\WGNPGTP*B 
GFEDK\VFYVW FDATIG YLSI TAN YTDQWERWW\ KNPEQVDLYQ 

fm \akdnvpfhslvfpssalgaednytl\vshliateylnyedg 
k\ fsksrgvgvfrdm\ahdtg i p pdisrfyl\lyirpegk\dsa 
fswtdlllknns\elijwlgnfinra\gmfvskffgg\yvpemv 

LTPDDQRIiLA\HVTLBLQHYHQ\ LLEKVR IRDALRS I LTI S \RH 
GNQYI \QVNEPW\KRIXGSEADRQRAGTVTGLAVNIAALLSVML 
QP YMPTVS ATI QAQLQLP PPACS ILLTNFLCTLPAGEQIGTVSP 
LFQKLENDQ I ESLRQRFGGGQAKTS pkpawetvttakpqoiqa 

LNDE VTKQGNX VRELKAQKAO KWEVAABVAJCLLDLKKQLA VAEG 
KPPEAPKGKKKK 


5899 


325 


1078 


NCP KS KE PNGVRAP slp s p LRAAMALS dvdvkkq ikhmmafi eq 
EANEKAEBIDAKAEEEFNIEKGRLVQTQRLKIMEYYEKKEKQIE 
QQKKILMSTMRNQARLKVLRARNDLISDLLSEAKLRLSRIVEDP 

evyqgllidklvlqgllrllepvmivrcrp\qdlllveaavqkai 
peymtisqkhvbv\qidkea*lavecswevvibvysgnqrikvsn 
tlesrldlsakqkmpeirmalfgantnrkffi ! 


5900 


64 


1409 


kaasrdspclefcplcgvsshdlqhrmwyhrlshlhsrlqdllk 
ggviypalpqpnfksllplavhwhrtasksltcawqqhedhfel 
kyantvmrfdyvwlrdhcr9ascynskthqrsldtasvdlcikp 
ktirldbttlfftwpdghvtkydlnwlvknsyegqkqkviqpri 

LWNAE IYQQAQVPSVDCQS FLETNEGLKKFLQNFLLYGIAFVEN 

vpptqehteklaerisliretiygrmwyftsdfsrgdtaytkla 
ldrhtdtty fqe pcg i qvfhclkhegtggrtll vdgfyaaeq vl 








QKAPBEPELLSKSAI \KHEYIEDVGECHQPHDWDWAQS* ISTHG 
/ YKE LY L I R YNNYDRAV INTVP YWVHRWYTAHRTLT I ELRRP E 
NEPWVKLKPGRVLPlDNWRVLHGRECFTGYRQLCGCYLTRDDVIi 
NTARLLGLQA 


5901 




2121 


VAI EQTS LKMMQAVGGAP ARP TGEY I CNQCGAKYTS IiDSFQTIUj " 

kthldtvl? kltcpqcnkefpnqesllkhvtihfmitstyyi ce 

scdkqptsvddlqkhlldmhtfvffrctlcqevfdskvsiqlhl 

Xavio^nekkvyrctscnwdfrnetdi^lhvkhnhlenOgk^ 

ci pcgbsfgtevelqchi tths kkynckfcs kapha! illekhl 

rekhcvfetktpncgt>rgaseqvqkeevelqtlltnsqeshnsh 

dgseedvdtsepmygcdicgaaytmetllqnhqlrdhnirpges 

ai vkkkael i kgnykcnvcs rtffsenglrehmqthlg pvkkym 

CPICOERPPSLLTLTEHKVTHSKSIiDTGNC^ICKMPIjQSEEBFL 
BHCQMKPDIjRNSLTGFRCVVCMQTVTSTLELfaHGTFHMQKTGN 
GSAVQTTGRGQHVQKLYKCASCLKEFRSKQDLVKLDINGJLiPYGL 
CAGCVKLS KSAS PGINVPPGTNRPGLGQNEMLS A I EGKGKVGG L 
KTRCS * LATFKF* VLKVELPE PHPKPPHRGVSRPDSNSTQLKTP 
QVS PMPRIS PSQSDEKKT YQCI KCQMVFYNEWDIQVHVANHKID 
EGLNHECKLCSQTPDSPAKLQCHLIBHSPEGMGGTFKCPVCFTV 
FVQANKLQQH I FSAHGQEDK I YDCTQC PQKFF FQTELQNHTWTQ 
HSS


5902


712 


209 


IiKNRRRSRPS I RQS I GS TS VS RWLTS L FT YLDHTADVQ * V* REF 
1PLXPRQ* ED *MFQSWLHAWGDTLEEAFEQCAMAMFOYMTDTGT 
VEPLQTVEVETQGDDLQSLLFHFLDEWI*YKPSADEFPIP\GWGE 
EPSLSKHPQGTEVKAITYSAMQVYNEENPEVFVIIDI 


5903 



2106 


735 


DTPGPSLPSTTAPPSLRSLSPPSRPSYLLPGDPQPLQGRGLPTT 
P AliFALSAVPGGAAS PM P PSGLRLLPLLL PLLVfLLVLT PGRPAA 
GLSTCKTI DMELVKRKRIEAIRGQI LSKLRLASPPSQGEVPPGP 
LPEAVLALYNS TRDRVAGESAEPEPE P EAD YYAKEVTR VLMVET 
HNE I YDKFKQS THS I YMFFNTS ELREAVPE P VLLSRAELRLLRL 
KLKVEQHVELYQKYSNNSWRYLSNRLLAPSDSPEWLSFDVTGW 
RQWLSRGGEIEGFRLSAHCSCDSRDNTLQVDINGFTTGR\RGDL 
ATIHGMNRPFLI*Ij^TPLERAQHLQS\SRHRQ^\DTNY\CFSF 
HGGRNCLRC/ VHC*HLI FRKDL\GW\ KWI \HE \ P KGYHANFC\L 
GPCP Y I WS LDTQ YSKVLAJb YNQ \HK PG\ A3 AAP\ CCVPQALEP \ 
LPIVYY\VGRKPKVEQIjSNMIVRSCKCS 


5904 


3 


1126 


MMBEIENAINTFKE1EQRLIYEELI1CEEKTTNNELSAISRKIDTW- 
ALGNSETEKAFRAISSKVPVDKVTPSTLPEEVLDFEKPLQQTGG 
RQGAWDDYDHQNFVKVRNKHKGKPTFMEEVLEHLPGKTQDEVQQ 
HEKWYQKFLALEERKKESIQIWKTKKQQKREEIFKLKEKADNTP 
VLFHNKQEDNQKQKEEQRKKQKLAVEAWKKQKS IEMSMKCASQL 
KBEEEKEKKHQKERQRQFKLKLLLESYTQQKKEQEEFLRLEKEI 
REKAEKAEKRKNAADE I SRFQERDLHKLELK1 LDRQAKBDE KSQ 
KQRRLAKLXEKVENNVS RD PSRL Y/ NTHQRLGRTNQKDRTNRLW 
ATSTYPT*GYSNLETRNTEKSMR 


5905 


287 


2912 


MAS FPPRVNEKE TVRLRT IG ELLA PAAPFDKKCGRENWTVAFAP " 
DGSYFAWSQGHRTVKLVPWSQCLQNFLLHGTKNVTNSSSLRLPR 
QNSDGGQKNKPREHIIDCGDIVWSLAFGSSVPBKQSRCVNIEWH 
RFRFGQDQLLLATGLNSGRI KI WD VYTG KLLLNLVDHTG WRDL 
TPAPDGSLILVSASRDKTLRVWDLRDDGN\MMKVLRGHQNI'JVY\ 
SCAFSPDSSMLCSVGASKAWAAILV*LRIOfHHSHT3ATMVLS 
W AE RVAS LATGLGATFTIG* SWLAFVLOG VLYVHRCWSM ^TFP P 
SPFLFFFFKVISPTVKYH*LLSKLIFQFYGIGSLTSBTNLM*SI 
WLSNGFSVLPFGILSDSRDI LRL* FNLKFVLI FF * K* CXVSVQK 
KKKPKRIALLQEERLS*DKPPSSHLI*QTEVNIRILFRAILHS* 
LLIFRI *NCI *TYS * IIDPFYIQMTYDRG* FGKNKMVKF+FIEM 
* L YYFHKI AFS FCNW*HPCCLPKKFHLAVNI LFACS ICFS S * A 
QVGDPSLIj*TSDYLKGRCQWSNNLLTLRFLSVYFFKNLVVSGKK 
REGGL* YLTLFISVYFS * LVFGINGPQYS FWKLHCLYFMFRL I 
FKLTFNRNI+NRICMSALINLKTDFNLTMTLSIFFKLLI IYNA* 
YNLN* I + Q P* YKMCHFVL»CMS E * S YNIC1»FIAGF \ LWNMDKYTM | 








. RKLEGHHHDWACDFS PDGALLATAS YDTRVYI WD PHNGD I LM 
EFGHLFPPPTPIFAGGANDRWVRSVSFSHDGLHVASIADDKMVR 
FWR I DEDY P VQVAPLSKGLCCAF STDG SVLAAGTHDGS V YFWAT 
PRQVPSLQHLCRMSIRRVMPTQEVQBLPIPSKIiLEFLSYRI 


5906 


146 


2038 


RKGAGSGRMASGA\YNPYIEIIEQPRQRGMRFRYKCE(jJt§AGSl 
PGEHSTDNNRTYPSIQIMNYYGKGKV\RITLVTK\NDPYKPHPH 
DLVG KDCRD \G YYEAE FGQE\ RRP \ I*FFQN\LGIRCVKKKEVKE 
A\IITR\IKAGINPFDVP*KQLKDIEDCDLDWRLWPRVFLPDG 
HGNI»\ TTALPPV\ VSS PI YDNRAPNTAELR VCR VNKNCGS VRGG 
DE IFLLCDKVQKDDIEVRFVLNDWBAKGI FSQADVHRQVAI VFK 
TPPYCKAITEPVTVKMQLRRPSDQEVSESMDFRYLPDEKDTYGN 
KAKKQKTTLLFQKLCQDHVETGFRHVDQDGLELLTSGDPPTLAS 
QSAGITVNFPERPRPGLLGSIGEGRYFKKEPNLFSHDAWREMP 
TGVSSQAESYYPSPGPISSGLSHHASMAPLPSSSWSSVAHPTPR 
SGNTNPLSSFSTRTLPSNSQGIPPFLRIPVGNDLNASNACIYNN 
ADDIVGMEASSMPSADLYGISDPNMLSNCSVNMMTTSSDSMGET 
DNPRLLSMNLENPSCNS VLDPRDLRQLHQMS SS S MSAGANSNTT 
VFVSQSDAFEGSDFSCADNSMINESGPSNSTNPNSHVFVQDSQY 
SGIGSMQNEQLSDSFPYEFFOV 


5907 


99 


1873 


TYIiLSSWSS * * nLdt^ 2 ksqvkWJ ttkfl WJgc i s <<?p y pqpakqngk 

KATSKVPSAPHFVHPNDHANREAELKKKWVBEMREKQQAAREQE 
RQKRRTI ES YCQDVLRRQEE FEHKEEVLQEIiNM F PQLDDEATRK 
AYYKEFRKVVEYSD^LEVU>ARDPI^CRCFQMEEAVIjRAQGNK 
KLVLVLNKIDLVP KEWEKWLDYLRN B LPTVAFKASTQHQVKNL 
NRCSVPVDQASESLLKS KACFGAEWLMRVLGNYCRLGEVRTHIR 
VGWGLPNVGKSSLINSLKRSRAC5VGAVPGITKFMQEVYLDKF 
IRLLDAPGIVPGPNSEVGTII^CVHVQKLADPVTPVETILQRC 
NLEEISNYYGVSGFQTTEHFLTAVAHRLGKKKKGGLYSQEQAAK 
AVLAD WVSGKI S FYI PPPATHTL PTHLS AE 1 V KEMTEVFD I E DT 
EQANEPTMECLATGESDELLGDTDPLEMEI KLLHSPMTKIADAI 
ENKTTVYKIGDLTGYCTNPNRHQMGWAKRNVDHRPKSNSrm>VC 
SVDRRSVLQRIMETDPI^GQAIjASAIKNKXKMQKRADKIASKL 
SDSMH3ALDLSGNADDGVGD «• 


5908 


247 


975


HCGIKKRGEGSGSPSPASGGFQUGCQIP3PSLPSEEETttPkTRA 
HTRTLRATLTRRPPRSHSTRLRFPMPLDGDGGLASWK/ pmrer* 
GWRR PAKAAGAS LGVAATGKRGCRMS KRYLQKATKGXLLI 1 1 FI 
VTLWGKN^SANHHKAHHVKTGTCSVVALHRCC^^ QT 
VKCS CFPGQVAGTTRAAPSCVDAS I VEQKWWCHMQPCLEGEECK 
VLP DRKGKSCS S GNKVKTTRVTH 


5909 


1 


5002 


PAI PG S TX I WAPGSHSAARADGRHGS I* PS OSQAPGALCGARAP P 
SSNL-RADRSMICAQARAGKNLYHNRFLGLAAMAFPSRNSQSLRR 
CKEPIRYSYNPDQFHNMDLRGGPHDGVTIPRSTSDTDLVTSDSR 
STU4GRSSYYS IGHSQDLVIHWDIKEEVDAGDWIGMYL I DEVLS 
ENFLDYKNRGVNGSHRGQI I W KI DAS S YFVEPETKI CFKYYHGV 
SGAIJIATTPSVTVKNSAAPIFKSIGADETVQGQGSRRIiISFSIjS 
DFQAMGLKKGMFFNPDP YLKIS IQPGKHS I FPALPHHGQERRSK 
IIGNTVNPIWQAEQFSFVSLPTDVLEIEVKDKFAKSRPIIKRFL 
GIQ^PVQRLIjBRHAIGDRVVSYTLGRRLPTDHVSGQLQFRFEI 
TSSIHPDDEEI5LSTEPESAQIQDSPMNNLMESGSGEPRSEAPE 
SSESWKPEQLGEGSVPDRPGNQS I ELSRPAEEAAVITEAGDQGM 
VS VGPEGAGELLAQVQKD I QPAPS AEELAEQLD LGE E AS AIJLL E 
DGEAPASTKEEPLEEEATTQSRAGREEBEKBQEBEGDVSTLBQG 
EGRLQLRAS VKR KSR P CSLP VSEL ETVI ASACGDPETPRTHYI R 
1HTLLHSMPSAQGGSAAEEEDGAEBESTLKDSSEKDGLSEVDTV 
AADPSALEEDR EEPEGATPGTAHPGHSGGHFPS LANGAAQDGDT 
HPSTGSESDSSPRQGGDHSCEGCDASCCSPSCYSSSCYSTSCYS 
SSCYSASCYSPSCYNGNRFASHTRFSSVDSAKISESTVFSSQDD 
EEEENSAFESVPDSMQSPEIJ3PESTNGAGPWQDELAAPSGHVER 
SPEGLESPVAGPSNRREGECPILHNSQPVSQLPSLRPEHHHYPT 
IDEPLPPNWEARIDSHGRVFYVDHVNRTTTWQRPTAAATPDGMR 
RSGSIQQMEQLNRRYQNIQRTIATERSEEDSGSQSCEQAPAGGG 
6910






GGGGSDSBABSSQSSLDLRREGSLS PVNSQKITIiZjLQS PAVKFl" 

TNPEFFTVLHANYSAYRVFTSSTCLKHMlLXVRRDARNFERirQH 

NRDLVNFINMFADTRLELPRGWEIICrDQQGK5FFVDHNSRArTF 

IDPRI PliQNGRLPNHLTHRQHLQRLRSySAGEASEVSRNRGASL 

LAR PGH5 LVAAI RSQHQHES LPLAYNDKTVAFLRQ PNI FEMLQE 

RQPSIiARNHTLREKIHYIRTEGNHGLEKLSCDADLVILLSIiFEB 

EIMSYVPLQAAFHPGYSFSPRCSPCSSPQNSPGLQRASARAPSP 

YRRDFF*AKLRNFYRKI<EAKGFGQGPGKIKLIIRRDHI«LEGTFNQ 

VMAYSRKELQRNKLYVTFVGEEGLDYSGPSRBFFFLLSQELFNP 

YYGLPE YS ANDT YTVQIS PMSAF VENHLEWFRFS GRILG \ LAL I 

HQYLLDAFFT\RPFYKALL\RliPC\D\LSDLEYLDl£EFHQSLQW 

MKDNNITDILDLTFTVNEEVTGQVTERELKSGGANTQVTEKNKK 

EYIBRJWKWRVERGWGQTEALVRGFYEVVDSRLVSVFDARELE 

LVI ACTA E I DLNDWRNNTB YRGGYHDGHLVIRWFWAAVERPNNE 

QRLRLLQ FVTGTS S VPYEG FAAP P WE PMGLRRPLP * KKWGKI TS 

LPPRG\HTCLQPDWDLPTVSPRTPMLYEK\LLTA\VEBTSTFGT 


5911 


1526 


446 


VAE FAAM E PQRTQI KLDPR YTADLLB VliKTN YG I P S ACFSQ PPT 
AAQLLRALGP VELALTS ILTLLALGS I AI FLBDAVYL YKNTLCP 
I KRRTLLWKSSAPTWSVLCCFQIiWI PRSLVLVEMTITSFYAVC 
FYLLMIjVMVEGFGGKEAVLRTLRDTPMMVHTGPCCCCCPCCPRL 
LLTRKKLQ\R*CWALSNTPS*R*R*PWWACFSSPTASMTQQTFL 
RGAQLYGSTLSSA/CSTLLALWTLGI ISRQARLHLGEQNMGAKF 
ALFQVLL ILTALQPSI FSVLANGGQIACS PPYS SKTRSQVMNCH 
LLILBTFI^rVLTRT^TYRJRKDIIKVGYETFSSPDLDLNIjKALRWM 

AYftVl JUj Uv- in 




109 


595 


QuPIAPCIQGKGLEMRSPKPQSFIIRSSHSGAGLLVKNPSTPVF- 
CGHRRGGAAFKYKPTP WGPBQRPTGOKHMRGGVSLLS PRLECS 
GTISAHCNLRLPSSSNSPAPAS*LAGITGVCHHAQL1FVFLVET 
GFHHVGQAGLELL/NWIHLPRPPKVLGLQA 


5912 


924 


277 


MILNKAIjMIjGALALTTVMS PCGGEDI VADHVAS YGVNLYQS YG P 
SGQYSHBFDGDEEFYVlJIiE^KETVWQLPIiFRRFRRFDPOFALTN 
IAVLKHNLHIVIKRSNSTAATNEVPEVTVFSKSPVTLGQPNTLI 
CLVDNIFPPWNITWLSNGHSVTEGVSETRPSSPKSDHFrjbQDQ 
VTSPSFPFE* * DL * TAKVEQ LGAWFE P LLKHWGAE I PTTL 


5913 


46 


1198 


QLRMAGAEGAAIiKQSEI^PVVsLVDVLESDEEI^NEACAVLGGS 
DSEKCSYSQGSVKRQALYACSTCTPEGEEPAGICLACSYECHGS 
HKLFELYTKMFRCDCGNSKFKNLECKLLPDKAKVNSGNKYNDN 
FFGLYCI CKRPYPDPEDEZ PDEMIQC WCEDW FHGRHLGAI P PE 
SGDFQEKVCQACMKRCS FLWAYAAQLAVTKIS T\GMMD WCGTLM 
E* /DDQEVIKPENGEHQDSTLKEDVPEQGKDDVREVKVEQMSEP 
CAGSSSESDLQTVFKNESLNAES1CSGCKLQELKAKQLIKKDTAT 
YWPLN WRS KLCTCQDCMKMYGDLD VLFLTDEYDTVIiAYEN KGK I 
AQATDRSDPLMDTLSSMNRVQQVELIC/GIQ* FED 


5914 
1 ^Qi 


560 


124 


NLGGSELPPEKALiFIQVASMNQRRVDFYIjASIEDMIiVAI /GGRN 
E^IGALSSVETYSPKTDSWSYVAGIjPRFTYGHAGTIYKDFVYISG 
GHDYQIGPYRKNLLCYDHRTDVVJEERRPMTTARGWHSMCSLGDS 
j. i a j.\3\aauua i tibPUfiRFD vLiGVEAi 5 PQCNQWTRVAPLLKANSE 
SGVAVWEGRI YI LGGYS WENTAFSKTVQVYDREADKWSRGVDLP 

KAIAGGSACFIAP*SLGQRTRKRKAKARGTRTGASDPSCASWDH 
PHRHLPGLCRPAATS 




1604 


703 


FPGRPTRPLKLGRRRKRARI IQAPHCIISPRPRTCPPGALQAPEA 
P ASRAEGPVAVWNGHTEGP APARSAPKEP PGLPRPLG S FP CPT 
PQEDFPAi^GPCPPRMPPSPGFSAWLLKGTPPPPPPGLVPPIS 
KPPPGFSGLI,PSPHP\PVSPAPPPPPPQK/RPRLLPAP/PGLPS 
PRELPGEEPSAHPVHQGLPAERRGPIjQRVQEPLRGVQTGPDLRS 
PVLQELPGPAGGEFPBGL* +AAGPAAH 


5916
5917


256
1343


633

827


3PRNWEIWGPWHRWESFSLEGEMPSRIPEPSPDSTi53TSGKGCK 
rVTGAVHRHLNHVAGI IPWVLHSQLKPTAATAQDQWTSQQYPDH 
=TRLI LQ+NQATADKNN* TTALLQPHQRL\ VS PRMAEA 






^HQILTYLEP/ICLWKYMKIIiTVFLTKSVLEi*KFIHTPQTYR 
?*NDFFGIKEVyvSRRLRKTSP/kiAVTPLBQAVVSKECVPVDQ
FMEHLL PS LLSLAS D PVPNVRVLLAKALRQMIjLB KAY PRNAGNP 
HLEVIEET ILALQSDRDQDVS PPAALB PKRRNI IDTAVLBKQN ( 




13 


1247 


EGAQVARRRSRRQWRAGRCGRGRGGRRAERTGGRGPPGRPRPLP 
PGPARRGRRRMETPFYGDEALSGU3GGASGSGGTPASPGRLFPG 
A?PT7\AAGSMMKKDALTLSLSEQVAAALJCPAPAPASYPPA\ADG 
APSAAPPDGLLASPDLGLLKIAS PELBRL I IQSNGLVTTTPTS S 
QFLYPKYAAS BEQEPAEGFVKALEDLKKQNQLGAGRAAAAAAAA 
AGGPSGTATGSAPPGELAPAAAAPEAPVYA\NLSSY\AGGCRGL 
RGGAAT\VAPAAEPVPFPPPPPPGALGPRRP/RLALQGRRPQTV 
PDVP\SFGESP\PLSPIET\DTPRRI\KAKRKRL\RNPQIRAPK 
PAS RKLGAQSRAUBRESEDPS * S PEHGSIiASTASLLREQVAQLK 
QKVLSHVNSGCQLLPQHQVPAY | 


5919 


1 


4254 


TS VQGDSQGTPTSSQGS INMEHtf t S QAlHdSTTSTTS SSSTQSG 
GSGAAHRLADVMAQTHI ENHSAPPDVTTYTSEHSIQVERPQGST 
GSRTAPKYGNAELMETGDGVPVSSRVSAKIQQLVNTLKRPKRPP 
LREPPVDDFEELLEVQQPDPNQPXPEGAQMLAMRGEQIjGVVTNW 
PP3LEAALQRWGTI S P KAPCLTTMDTNGKPLY I LT YGKIjWTRS M 
KVAYSII^KLGrKQEPMVRPGDRVALVFPNNDPAAFMAAPYGCL 
LAEWPVP I E VPLTRKDAGSQQIGFLLGSCGVTVALTSDACHKG 
LPKSP1GEIPQFKGWPKLLWFVTESKHLSKPPRDWP\PHIKDAN 
NDTAYI E YICTCK\DGS VLGVTVTRTALLTHCQALTQACG YTEAE 
TI VNVLDPKKDVGLWHG1LTSVMNMMHVIS IPYSLMKVNPLSWI 
QKV CQVTCAKVAC^/KSRDMHWALVAHRDQRDINLSSLRMLIYADG 
ANP W S I S S CD AFLNVFQ 3 KGLR QEVI CPCAS S PEALTVA I RRPT 
DDSNQPPGRGVLSMHGLTYGVI RVDSEEKLSVLTVQDVGLVMPG 
AIMCSVKPDGVPQLCRTDEIGELCVCAVATGTSYYGLSGMTKNT 
PEVFAMTSSGAPISEYPFXRTCLLGFVGPGGLVFVVGKMDGLMV 
V5GRRHNADD I VATAIA VEPMKP VYRGRIAVPSVTVLHDERXVI I 
VAEQR ? DSTEEDS PQ WMSRVLQAI DS I HQVG VYCLAL VPANTL P 
KTPLGGIHLSETKQLFLEGSLHPCNVLMCPHTCVTNLPKPRQKQ 
PBIGPASVMVGNLVSGKRIAQASGRDLGQIEDNDQARXFLFLSE 
VLQ WRAQTT P DH I L YTLLNCRGA I ANS I*T CVQLHKRAE K I AVML 

MBRGHLQDGDHVALVYPPGIDLIAAFYGCLYAGCVPITVRPPHP 

QNIATTLPTVKMIVEVSRSAC^iMTTQLIC^^ 

TWPLILDTDD * PKKRPAQI CKPCNPJDTI1AYLDPSVSTTGMI1AGV 

KMSHAATS AFCRS I KLQC E LY PSRE VAI CLDP YCGLG PVLWCLC 

SVYSGHQSILIPPSELETNPALWLLAVSQYKVRDTFCSYSVMEL 

CTKGl^SQTESLKARGLDLSRVRTCVVVAEERPRIALTQSFSKIj 

PKDLGlJIPRAVSTSFGCRVNIiAICLQGTSGPDPTTVYVDMRAIJl 

HDRVRLVERGSPHSLPLMESGKILPGVRIIIANPETKGPLGDSH 

I/3E I WVHS AHNAS G YFTI YGDESLQSDHFNSRLS FGDTQT I WAR 

TGYU3FLRRTELTDANGBRHDALYVVGALDEAMELRGMRYHPID 

rETSVIRAHKSVTECAVFTWTNLLWWELDGSEQEALDLVPLV 

TNWLE EHYL I VGWWVD IGVJ P INSRGEKQRMHLRDGFLADQ 

LDPIYVAYNM 


5920 


13B1 


1499 


QU3AVAHAGVSRI PP*LFPPLHPTFI^bWCLHHKLP/HPPGASM| 
VRPPWPRRPPAHISSVRQASTQVPRTVPHTC2RVANIGTQTTGP 
SGVGCCTPGRPLLPCKCS^AAHSTYRVQEPAVHIPGQEPIjTASM 
LAAAPIiHEQKQMIGERLYPLIHDVHTQLAGKITGMLLEIDNSEI. 
LLMLESPESLHAKIDEAVAVLQAHQAMEQPKAYMH j 


5921 
5922 


727 
2475 


157 
495 


G3SVSNS DDEISS SDSADS CDSLNPPTTAS FTPTS I I/KRQKQLR 
RKNVRFDQVTV YYFARRQG FTS VPSQGGSSLGMAQRHNS VRS YX 
LCEFAQEQBVNHREILREHLKEEKliHAKKMKLTKNGTVESVEAD 
GLTLDDVS DEDIDVENVEVDDYF FLQPLPTKRRRAL LRASGVHR 
IDAEEKQELRAIRLSREECGCDCRLYCDPEACACSQAGIKOQVD 
RMS FPCGCSRDGCGNMAGR X E FT7PIR VRTHYLHTI MKLELES KR 

Q\GAAQQPQ\*GALPDCQLQPDRSTGL*DPSWIGSKGLSFTGKG 
AAATHLI ILRVTENRGAEGKRK | 








S YSNWGIi FPS VFIQ VPRSRTGNLKP IFh FYS YYE \ CMETLKG \ T | 
5923






CLYNATQYKVCSP^DRPDACYNPSEPAATTVFEIRTGLLLGDf" 
S KI I TRTEE KE I PKQ 1 TLRFDACAAINSKKLEIGOGS LN * ERS * 
R VENKYVCHESGVCKNCAYWPCVI * AT*KKNXNDSVYLQKGEAN 
PS CAAGHCNPLELI I TNPLDPHWKKGERVTlrGINRTGLKPQ WI 
LIKGEVH KCS PKPVFOTF YEELNLPAPEL LKKTKNLPLQLAENV 
I FLLNGTS C YVRGGTT IGDRWPWEA* ELVPTD PRPDTTDt*ttri? 
ASNF**VLKTS 1 1 RQYCIAREGKDFI I P VGKPNC IGQKL YNS TTK 
TIT* *DLNHTEXNPFSKFSKLKTA*AHAESH*DWTVPSGLY*IC 
RHRAYFRLPNKWADSCVIGTIKPSFFIiLPIKMGELLGPSVYASR 
EKKGXVIGNWKDNEWPRERIIOYYGPATUfinTYicur»VD /td Am 
MLNW I IRLQAILE 1 1 SNETGRALTVLAWQETQMRNAIYQNRLAL 
DYLLVAEGGVCRKFNLTNCCLQINDQGQVVKNIVRDMTKLAHVP 
rQVWHKFDPESLFGKWFPAIGGFKTLIVGfVLLVIRTCLLLPCVL 
PLLFQM I KG IVATL VHQKTS AHVNYMNHYRS I S QRDS KS EDE S E 
NSH 


5924


137 


638 


QLCGRRGGRFRTSiKRMHPI*RTCPMTNL/'liLLSQENtQimr" 
QQBNRELWISLEEHQDALELXMSKYRKQMLQLMVAKKAVDAEPV 
UCAHQSHSAEIESQIDRICEMGEVMRKAVQVDDDQFCKIQEKLA 
QLELENKELRELLS ISS ESLQARKBNSMDTASQAX K 




274 


2146 


EKGKVKDAOAEQWISLSLSCKGSWETQFSNHLNSLTPPTSVRRM" 
PLITTVTLLKMVARHRKKLLCSKAFSTQLQQKIFLHSQMGIHHQ 
SVCMKLKPNTSHI I S ILMGQPMALVQLETLAPLTI I IQKFQTQD 
HMKFWKMLPLHSHHLTPSVPQTVIPKKTGSPEIKLKITKTIQ1TG 
RELFESSLCX3DHiNRVQASE\Q*NQSIESRKEKRKKSNKKDSSR 

w> cuiaiw n rvi ir ruj& f A K irvi a K V tj£ V o £ tvlr KJiJSFVIjiGSGSPSS 

ANTIFCSNNGSVHW\FKFQVGDIjVWSKVGTYPWWPCMVSSDPQI, 
EVHTKINTRGAREYHVQFFSNQPERAWVHEKRVREYKGHKQYEE 
LLAEATKQASNHSEKQKIRKPRPQRBRAQVTOIGIAHAEKALKMT 
REER1EQYTFIYIDKQPEEALSQAKKSVASKTEVKKTRRPRSVL 
NTQPEQTNAGEVASSLSSTEIRRHSQRRHTSAEEEEPPPVKIAW 
KTAAARKSLPAS ITMHKGSLDLQKCNMS PWKI EQVFALQNA TG 
DGKFI DQFVYSTKGT GNKTE ISVRGQDRLI ISTPNQRNEKPTQS 

VSSPEATSGSTGSVEKKQQRRSIRTRSESEKSTEWPKKKIKKE 
QVGFLHVES 


5925 


216 


1911 


MMTAESREATGLSPQAAQEKDGIVIVKVEEEDEBDHMWGQDSTL 
QDTPPPDPE IFRQR FRRFCYQNTFGPREALS RLKEIiCHQ WLRPE 
INTKEQIIiELLVLBQFLSILPKELQVWIiQEYRPDSGEEAVTLLE 

QSHFKHSSRKPRLLQSRALPAAHI PAPPHEGSPRDQAMASALFT 
ADS QAMVK I EDMAVS L I LEEWGCONIiARRNLSRDNBfJRNVRQ a p 

POGGENRWEKfEESTSKAETSEDSASRGETTGRSQKEFGEKRDQE 
GKTGERQQKNPEEKTRKEKRDSGPAIGKDKKTITGBRGPREKGK 
GLGRS FS LS SNFTTPE E VPTGTKSHRCDECGKCFTRS S S L IRHK 
irHTGEKPYECSECGKAF\SLNS\NLVLHQRl\HTGEKPHECNE 
CGKAFSHSSNLILHQRIHSGEKPYECNECGKAFSQSSD\LTKHQ 
R IHTGEK P YECS EOGKAFNRNS YL I LHRR VHTREKP YKCTKOGK 
\AFTRSSTLTLHHRI HARERASEYS PAS LDAFGAFLKS C V 


5926 


2 


233 


DRCLMLKQGSQPGSPPAT/CEPPAPPVYQAPCQSCPEPPGAnEP 
SDSPHHTPVHPPPEHSAACPAPATCCPPPRSSMS 


5927 


4146 


1248 

1 


KHFSKFGSQALYQLKRPASGQNSISVMPAQKITKPAAKYGIPLA 
YKKYGDKKIiHEKKPLQKHKQAHQTPEKRVNTGEERRKISEEAAR 
KRRLEFIEKEKKQKDQI ISLMKAEQMKRQBKERLERINRAREQG 
WRNVLSAGGSGEVKAPFLGSGGTIAPSSFSSRGQYEHYHAI FDQ 
MQQQRAEDNEAXWKRE I YGRGLPERQKGQLAVERAKQVEEFLQR 
KREAMQNKARAEGHMG I LQNLAAM YGGRPSSS RGGKPRNKEEEV 
YIJu^LRQIRLQNFNERQQIJCAKLRGBKKEANHSEGQEGSEEADM 
RRKK\ IESLKAHANARAAVLKEQLERKRXEAYEREKKVMEEHLV 
WCGVXSSDVSPPLGQHETGGSPSKQOMRSVISVTSALKEVGVDS 
S LTDTRETS EEMQKTNNAI S S KR E I LRRLNE NLXAQEDEKG KQN 
CjSDTFEINVHEDAKEHEKEKSVSSDRKKWEAGGQLVIPLDELTL 
DTS FSTTERHTVGEVI KLGPKGS PRRAWGKS PTDSVLKI LGEAE | 






WO 01/53312 



PCT/US00/34263 



LQIiQlBijdilSNTTXRSE I S PEGEKY KPLITGEkKVGciSHEINPS 
AIVDSPVETKSPEFSEASPQMSLKLEGNLEEPDDLETEILQEPS 
GTNKDE \ SLPCTITD VW I SBEKETKBTQS ADR I TIQENE VSEDG 
VSSTVDQLSDIHIBPGTNDSQHSKCDVDKSVQPEPFrHKWHSE 
HLNLVPQVQSVQCSPEESPAFRSHSHLPPKNKNKNSLLIGLSTG 
LFDANNP KMLRTCSLPDLS KI*FRTLMDVPTVGDVRQDNLE I DE I 
EDENIKEGPSDSEDIVFEETDTDLQELOASMEQLLREQPGEEYS 
EEEESVLKNSDVEPTANGTDVADEDDNPSSESALNEEWHSDNSD 
GE I AS ECECDS VFNHLEELRLKLEQEMG FEKFFE VYB K I KAXHE 
DEDEN I E I CS KI VQNI LGNEHQHLYAFCELHLVMADG AYQEDNDE 


5928 


4146


1248 


KHESKFGSQALYQLKRPASGQNS 1 S VMPAQKI TKPAAKYG1 PLA 
YKKYGDKKLHEKKPLQKHKQAHQTPBKRVNTGEERRKISEEAAR 
KRRLBFIEKEKKQKDQI I SliMKAEQMKRQEKBRLER INRAREQG 
WRNVLSAGGSGEVKAPFLGSGGTIAPSSFSSRGQYEHYKAIFDQ 

mqqqraedneakwkreiygrglperqkgqlaverakqveeflqr 

KRE AMQNKARAEGHMG ILQNIAAM YGGR PS S SRGGKPRNKEEEV 
YUARLRQIRLQNFNERQQIKAKLRGEiCKEANHSEGQEGSEEADM 
RRKK\ IESLKAHANARAAVLKEOLERKRKBAYEREKKVWEEHIiV 
AKG VKS S DVS P P LGQHETGGS PS KQQM RS VI S VTS ALKE VG VDS 
SLTDTRSTSEEMQKTNNAISSKREILRRLNENLKAQEDEKGKQN 
LSDTFE INVHEDAKEHEKEKSVSSDRKKWEAGGQL VIPLDELTL 
DTSFSTTERHTVGEVIKLGPNGSPRRAWGKS ptds vlkilgeab 

uvMVibjuoiu i j.K^nu.&Jt'ii^^Js.i {CPLITGEKKVQCISHEIKTPS 

aivdspvetkspbfseaspqmslklegnleepddlbteilqeps 
gtnkde\slpctitdvwiseeketketqsadritiqenevsedg 

VSSTVDQLSDIHIEPGTNDSQHSKCDVDKSVQPEPFFHKWIISE 

HLNLVPQVQS VQCS P EES FAFRSHS H1»P P KNXNKN3 LL IGLS TG 

LFDANNPKMLRTCSI^DI^KIjFRTI^D^ 

EDENIKEGPSDSEDIVFEETDTDLQELQASMEQLLREQPGEEYS 

EEEES VLKNSDVE PTANGTDVADEDDNPS SES ALN EE WHS DNS D 

GEIASBCECDSVFNHLEELRIiHXjEQEMGFEKFFEVYEKIKAIHE 

DSDENIEICSKIVQNILGNEHQHLYAKILHLVMADGAYQEDNDE 


5929 j 

5930 4 


3 


1558 


IiDFSMTTQLPAYVaILLFYVSRASCQDTFTAAVYEHAAJLPNAT 
LTPVSREEALALMNRNIiDILBGAITSAADQGAHI ivtpedai yg 

WNFNRDSLYPYl»EDIPDPRVWlJTPr , MM»K»OX?rJrtn«n'if'ri-»nr r>j~,r \ 
Mjw^jj » f iwoui e Ufa vein A. ft-Nff KNKFGQT P VQ3RLS CL \ 

aknnsiywanigdkkpcdtsdpqcppdgryqyntdwf\dsqg 
klvaryhkqnlfmgenqfnvpkepeiwfnttfgsfgiftcfdi 
lfhdpavtlvkd frvdtivfptawmnvl phls avefhs awamgm 
rvnflasnihypskkmtgsgiyapnssrafhydmkteegkllls 
qldshpshsavvnwtsyassiealssgnkefkgtvffdeftfvk 

LTGVAGNYTVCQKDLCCHLSYKMS ENI PNEVYALGAFDGLHTVE 

gr yylq 2 ctllkckttnlntcgds aetas tr femfs lsgtfgtq 
ywpbvllsenqlapgefqvstdgrlfslkptsgpvltvtlfgr 

LYEKDHASNASSGIiTAQARI IMLI VIAP I VCS LSW 




113 


6082



RGNCFWIVPirTMAQRTGIiEDPERYLFVDRAVIYNPATQADWTAK 

KLVWIPSERHGFEAASIKEERGDEVMVBLAENGKKAMVNKDDIQ 

KMNPPKFSiCVEDMAELTCLNEASVLHNLiCDRYYSGLrYTYSGLF 

CWINPYKNLPIYSENIIEMYRGKKRHEMPPHIYAISESAYnCM 

I^DREDQSILCTGESGAGKTENTKKVIQYLAHVASSHKGRKDHN 

I PGE \LBRQLLQANP I LES FGNART VQNDNS S RFGKFIRINFD V 

TGYIVGANIETYLLEKSRAVRQAKDERTFHI FYQLLSG\AGEHL 

KSDLIiLEGFNNYRFLSNGYXPIPGQ\QDKGNFRGDPGEAl^HIKG 

PSHEEILSMLKWSSVLQFGNISFKKERNTDQASMPENTVAQKL 

UHLLGMNVME FTRA ILTPR I KVGRDYVQ KAQTKEQADFAVEALA 

fCATYERLFRWLVHRINKALDRTKRQGASFIGILDIAGFEIFELN 

3FEQLCINYTNEKLQQLFNHTMFILEQEEYQREGIEWNFIDFGL 

3UJPCIDL1ERPANPPGVLALLDEBCWFPKATDKTFVEKLVQEQ 

3SHSKFQKPRQLKDKADFCIIHYAGKVDYKADEWLMKNMDPLND 

IVATLUiQSSDRFVAELWKDVDRIVGLDQVTGMTBTAFGSAYKT 

CKGMFRTVGQLYKESLTKLMATLRNTNPNFVRCI IPNHEKRAGK 

jDPHLVLDQIiRCNGVLEGI RI CRQG FPNtt I VFQEFRQR YEI LTP 

NAIPKGFMDGKQACBRMIRALElLDPNLYRtGQSKIFFRAGVLAH 

LEE ER DLK I TD 1 1 1 F FQ AVCRG YLARKAFAKKQQQLS ALKVLQR 

NCAAYLKLRHWQWWRVFTKVKPLLQVTRQESELQAKDEELLKVK 

BKQTKVEGELEEMERKHQQLLEEKNILAEQLQAETBLFAEAEEM 

RAIUiAAKKQELEEILHDLBSRVEEBBERNQIIiQNEKKKMQAHIQ 

DLEEQLDEEEQARQKLQLEKVTAEAECIKKMEEEILLLEDQNSKF 

IKEKKLMEDRIAECSSQLAEBEEKAKNLAKIRNKQEVMISDLEE 

RLKKEEKTRQELEKAKRKLIX3ETTOLQDQIAEU3AQIDELKLQL 

AKKEEEIjQGAIARGDDETLHKNNALKVVRELQAQIAELQEDFES 

EKASRNKAEKQKRDLSEELEALKTELEDTLDTTAAQQELRTKRE 

QBVAELKKALEEETKNHEAQIQDMRQRHATALEELSEQLEQAKR 

FKANLE KNKQG LETDNKELACE VKVLQQ VKAESE HKRKKLDAQV 

QELHAJCVS EGDRLRVELAEKAS KLQNELDNVSTL LEE AE KKG I K 

FAKDAASLESQLQDTQEIiLQEETRQKLNLSSRIRQLEEEKNSLQ 

EO^EEEEEARKNLEKQVLALQSQLADTKKKVDDDLGTIESIjEEA ' 

KKKLL KDAEALS QRLE BKALAYDKLE KTKNRLQQELDDLTVDLD 

hqrqvasnlekkq\kkfdqllaeeksisaryaeerdraeaeare 

KETKALS LARALEEALEAKEEFERQNKQLRADMEDLMSSKDDVG 
KNVHELEKSKRALEQQV\EEMRTQLEELEDBLQATEDAKLRLEV 
NMQAMKAQPERDLQTRDEQNEEKKRliLIKQVRELEAELEDERKQ 
RALAVAS KKKMBI DLKDLEAQIEAANKARDE VI KQLRKLQAQMK 
DYQRELEEARASRDEIFAQSKESEKKLKSLEAEILQLQEELASS 
ERARRHAEQERDELADEITNSASGKSALLDEKRRLEARIAQLEE 
ELEEEQSN>ffiLLNDRFRKTTLQVDTLMAELAAERSAAQKSDNAR 
QQLE RQN KBLKAKLQELBGAVKS KFKATISALEAXIGQLEEQ LE 
QEAKBRAAAmCLVRRTEKKLKEIFMQVEDERRKADQYKEQMEKA 
NARMKQLKRQLBEAEEEATRANASRRKLQRELDDATEANEGLSR 

EVSTLKNRLRRGGPISFSSSRSGRRQLHLEGASLELSDDDTESK 
TSDVNBTQPPQSE 


5931 


113 


6082 



RGNCTWIVPFTMAQRTGI^DPBRYLFVDRAViY^PATOADTfTAir" 
KLVWIPSERHGFEAASIKBBRGDEVMVEIjAEWGKKAMVNKDD iq 
KMNPP KFS KVBDMAELTCLNEAS VLKNLKDRYYS G LI YTYSGLF 
CVVINPYKNLPXYSENIIEMYRGKKRHEMPPHIYAISESAYRCM 
LQDREDQS I L CTG ESGAGKTENTKKV I Q YLAHVAS SHKGR KDHN 
IPGE\LERQLLQANPIIiESFGfNARTVQNDNSSRFGKFrRINFDV 
TGYIVGANIETYLLEKSRAVRQAKDERTFHIFYQLLSG\AGEHL 
KSDLLLEGFNNYRFLSNGYIPI PGQ\QDKGNFRGDPGEAMHIMG 
FSHEElLSMLKWSSVIiQFGNISFKKERNTDQASMPENTVAQKL 
CHIiLGWNVI^FTRAILTPRIKVGRDWQKAQTKEQADFAVEALA 
KATYERLFRWLVHRINKALDRTKRQGASFIGILDIAGFEIFELN 
SFEQLCINYTNEKLQQLFNHTMPILEQEEYQRBGIEWNFIDFGL 
DLQPCIDL IERPANPPGVLALLDEECWFPKATDKTFVEKLVQEQ 
GSHSKFQKPRQIjKDIGADFCIIHYAGKVDYKADEWLMKNMDPIjND 
NVATLLHQS SDRPVABLWKDVDR IVGLDQ VTGMTETAFGS AYKT 
KKGMFRTVGQLYKESLTKLMATLRNTNPNFVRCI ipnhekragk 
LDPHLVLDQLRCNGVLEGIRICRQGFPNRIVFQEFRQRYEXLTP 
NAIPKGFMDGKQACERMIRALBIiDPNLYRIGQSKIFFRAGVLAH 
LEEERDLKI TDI I IFFQAVCRG YLARKAFAKKQQQLS ALKVLQR 
NCAAYlilOIJ^WQ WWRVFTKVTCPLLQ VTRQEE ELQAKDEELLKVK 
EKQTKVEGELEEMERKIIQQLLEEKNILAEQLQAETELFAEAEEW 
RARJLAAKKQELEEILHDLES RVEEEEERNO ILQNEKKKMQAH I Q 
DLEEQLDEEEGARQKLQLEKVTAEAKIKKMEEEILLLEDQNSKF 
iKBKKLMEDRIAECSSOLAEEEEKAiCNLAKIRKricniTUMTQnT pi? 
RL KKEEKTRQELEKAKRKLDGETTDLQDQ I AELQAQ I DEL KLQL 
AKKEEEI<y3AIJUiGDDETIiHKNNALICVVRELQAQIAELQEDFES 
EKAS RNKAE KQ KRDLSBELEAiXTELEDTLDTTAAQQELRTKRE 
3EVAELKKALEEETKNHEAQIQDMRQRHATALEELSEQLEQAKR 
FKANl^KNKG^LErTDNKELACEVKVLQQVKAESEHKRKXLDAQV 
2ELKAKVSEX3DRLRVELAEKASKGQNELDNVSTLLEEAEKXGIK 
FAKDAASLESQLQDTQBLLOEETRQKLNLSSR IRQLEEEKNSLQ 
SQ^gEEEEARKNLEKQVIiAIjQSQLADTKKICVDDDLGTIESLEEA 
KKKLLKDAEALSQRliE E KALAYDKLEKTKNRLQQE LDDLTVDLD 
HQRQ VAStTLEKKQ\ KK FDQLLAE EKS ISARYAEERDRAEAEARE 
KETKAIiSLARALEHALEAKEEPERQNKQLRADMEDLMSSKDDVG 
KNVHELEKS KRALEQQV\ EEMRTQLEELEDELQATEDAKLRL BV 
NMQ AM KAQ F E RDLQTRD E QNE E KKR LL I XQVR ELBA E LE DER KQ 
RAIiAVASKKKMEIDLKDLEAQIEAANKARDEVTKQLRKLQAQMK 
DYQRELEBARASRDE1 FAQS KESEKKLKSIiEAEILQLQE 3IASS 
BRARRHAEQERDELADE I TNS AS Q KS ALLDEKRRLEARI AQLBE 
BLEEEQSNMELWTORFRKTTLQVDTLNAEIAAERSAAQKSDNAR 
QQLERQNKELKAKLQELEQAVKSKFKATISALEAKlGQIiEEQLE 
QEAKERAAANKLVRRTE KKLKE I FMQVBDERRHADQYKEQME KA 
KARMKQLKRQLEEAEEEATRANASRRKLQRELDDATEANEGLSR 
EVSTJjKNRJjRRGGPISFSSSRSGRRQLHLEGASLELSDDDTESK 
TSDVNETQPPOSE 


5932 


33


572


RHLEE I CFLFLQKGRKLKLSGPR WEEGKPRGTGGLW VKAEANMG 

FGATLAVGLTIFVIjSVVTII icftcsccclyktcrrprpv\app 
PHPP/PWHAPYPQPPSVPPSYPGPSYQGYHTMPPQPGMPAAPY 
PMQYPPPYPAQPMGPPAYHETLAGGAAAPYPASQPPYNPAYMDA 
PKAAL 


5933 


1 


3190 


GTRKLKMADKTPGGSQKASSKTRS^bVHSSd^SDAHMDASGiP^D " 
SDMPSRTRPKSPRiCHNYRNESARESLCDSPHQNLSRPLLENKXK 
APS IGKMSTAKRTLSKKEQEELKKKEDEKAAAEIYEEFLAAFEG 
SDGNKVKTFVRGGWNAAKEBHBTDE KRGKI YKPSSRRADQKNP 
PKQSSNERPPSLLVIETKKPPLKKGEKEKKKSNIiBLFKEBliKQI 
QE BRDERHKT KGRLSR FE PPQSDS DGQRRSMDAPSRRNRSSG VL 
DDYAPGSHDVGDPSTT\NFYLGNI \NPQMNLKKCCCQEFGRFGP 
IJ^\nCIMWPRTDEERARERNCGPVAFMNRRDAERALKN^GKMI 
MSFEMKLGWGKAVPIPPHPIYIPPSMMEHTLPPPPSGLPFNAQP 
RERLKNPNAPMLPPPKNKEDPBKTLSQAIVKWIPTERNUAL1 
HRMIEFVVR^GPMFEAMlMNREINNPMFRFI,FENQrPAHVYYRW 
FCLYSILQGDSPTKWRTBDFRMFKNGSFWRPPPLNPYLHGMSEEQ 
ETEAFVEEPS KKGALKBEQRDKLBEILRGLTPRKNDIGDAMVPC 
LNNAEAAEE I VDC I TBS I*S I LKTPLPKKIARL YLVSDVL YNS SA ' 
KVANAS YYRKFFETKLCQI FSDLNATYRTIQGKLQSENFKQRVM 
TCFRAWEDMAIYPEPFIiIKLQNIFIiGLVNIIBEKETi2DVPDDLD 
GAP I EEE LDGAPLEDVDG I P I DAT P I DDLDGVP I KS LDDDLBG V 
P LDATEDS KKNE P I FKVAPS KWEAVD E S ELEAQAVTTS KWEL FD 
QHEESEEEENQNQEEESEDEEDTQSSKSEEHKLYSNPIKEElflTE 
S KFS KYS EMSEEKRAKLRE I ELKVMKFQDELESGKRPKKPGQS F 
QEQVEHYRDKLLQREKEKELERERERDKKDKEKLESRSKDKKEK 
DECTPTRKERKRRHSTSPSPSRSSSGRRVKSPSPKSERSERSER 
SHKESSRSRSSHKDSPRDV5KKAKRSPSGSRTPKRSRRSRSRSP 
XKSGKKSRSQSRS PHRSHKKS KGKTNTGRKPFKKAVTYWKCDLF 
LCPERSVF 


5934 


1 


3190 


GTRKLKMADKTPGGSQKASSKTRSSDVHSSGSSDAHMDASGPSD 
SDMPSRTRPKSPRKHNYRNESARESLCDSPHQNLSRPLLENKLK 
AFSIGKMSTAKRTLSKKEQEELFCKKBDEKAAAEIYEEFLAAFEG 
SDGNKVKTPVRGGWNAAKEEHETDEKRGKIYKPSSRFADQKNP 
PNQSSNERPPSLLVIETKKPPLKKGEKEKKKSNLELFKEELKQI 
Q EERDE RHKTKGRLS R FE P PQS DSDGQRRS MDAPS RRNRSSG VL 
DDYAPGSHDVGDPSTT\NFYLGNI\NPQMNLKKCCCQEFGRFGP 

MSFEMKLGWGKAVPIPPHPIYIPPSMMEHTLPPPPSGLPFNAQP 
RERLKNPNAPMLPPPKNKEDFEKTt^QAIVKWIPTERNLLALI 
HRMIEFVVREGPMFEAMIMNRE INNPMFRFLFENQTPAHVY YR W 
KLYS ILQGDS PTKWRTEDFRMFKNGSFWRPPPLNPYLHGMSEEQ 
ETEAFVEE PS KKGALKEEQRDKLEB ILRGLTPRKNI) IGDAMVFC 
LN NAfiAAE E I VDCI TES LS ILKTPLPKKI ARLYLVSDVLYNS SA 
KVANAS YYR KFFETKLCQIFSDLNAT YRTIQGHLQS ENFKQRVM 
TCFRAWEDWAI YPBPFIilKLQNIFLGLVNI IEEKETEDVPDDLD 
GAPIEEELDGAPLEDVDGIP IDATPIDDUDGVPIKSLDDDLDGV 



PLDATEDSKKNEP IFXVAPSKWEAVDES ELEAQAVTTS KWELFD 
QHEESEEEENQNQEEESBDEEDTQSSKSEEHHLYSNPIKEEMTB 
SKFSKVSEMSEEKRAKLREIELKVMKFQDELESGKRPKKPGQSP 
QEQVEHYRDKLLQREKEKELBRERERDKKDKEKLESRSKDKKEK 
DE CTPTR KERKRRHSTS PSPS RSSSGRRVKS PS PKS ERSERS ER 
SHKESSR5RSSHKDSPRDV3KKAKRSPSGSRTPKRSRRSRSRSP 
KKSG KK5R5QS RSPHRSHKKS KGKTNTGRKFFKKAVTYW KCDLP 
LCPERSVP 


5935 


3 


4493 


SyWZ^GWRLSRPPRQFWAGWRGIGRFGTMAPVHGDDCEIGASAL 

SDSGSPVSSRARREKKSKKGRQEALERLKKAKAGERYKYEVEDF 

TGVYE EVDEEQ YS KLVQARQDDDW I VDDDG IG YVEDGRE I FDDD 

LEDDAIiDADEKGKIXJKARNKDKRNVKKIiAVTKPNNIKSMFIACA 

GKKTADKAVPLSKDGLLGDILQDLNTETPQITPPPVMILKKKRS 

IGASPNPFSVHTATAVPSGK1ASPVSRKEPPLTPVPLKRAEPAG 

DD VQ VESTEE EQES GAME FEDGD FDE PME VEE VDL E P MAAKAWD 

KESEPAEEVKQ2ADSGKGTVSYLGSPLPDVSCWDIDQEGDSSPS 

VQEVQVDSSHLPLVKGADEEQVFHPYWLDAYEDQYNQPGWPLF 

GKVWIESABTKVSCCVMVKNIERTLYPliPREMKIDLNTGKETGT 

PIS MKD WEE FDEKIATKYKIMKPKSKPVE KNYAFE I P DVPEKS 

EYLEVKYSAEMPQLPQDLKGETFSHVFTSTNTSStiEXFLMNRICIK 

GPCWLEVKKSTALNQPVSWCKVEAMALKPDLVNVIKDVSPPPLV 

VMAFS M KTMQN AKNHQNB I IAMAALVHHSFALDKAAPKPPFQSH 

FCWS KPKDC I FPYAPKEVT EKKNVKVEJVAATERTLLG FFLAKV 

HKIDPDIIVGHNIYGFI^EVIiLQRINVCKAPHWSKIGRLKRSNM 

P KLGGRSGFGERNAT CGRM I CD VE I S AKEL I RC KS YHL SEL VQQ 

ILKTERWIPMENIQNMYSE3SQLLYLLEHTWKDA\KFII»QIMC 

ELNVLPLALQITNIAGNIMSRTLMGGRSERNBFLLLHAFYENNY 

IVPDKQIFRKPQQKLGDEDEEIDGDTNKY1TORKKGAYAGGLVL 

DPKVGFYDKFI LLIiDFNSLYPSI IQEFNICFTTVQRVASEAQICV 

TE DG EQ3QI PE LPDPS LEMG I LPRE I RKLVERRKQ VKQLMKQQD 

LN P DLI LQ YD I RQKALKLTANSMYGCLGFS YS RFYAKPLAALVT 

YKGRE IXiMHTKEMVQKMNLE VI YGDTDS IMINTNSTNLEB VFKI* 

GNKVKSEVNKLYKLLEIDIDGVFKSLLLLKKKKYAALWBPTSD 

GNYVTKQELKGLDIVRRDWCDLAKDTGNFVIGQILSDQSRDTIV 

ENIQKRJblEIGENVLNGS VPVSQFE INKALTKDPQD Y PDRKS LP 

HVHVALWINSQGGRKVKAGDTVS YVI CQDGSNLTAS QRAYAPEQ 

LQKQDNLTIDTQYYIiAQQIHPVVaRICEPIDGIDAVLIATGWEI. 

\DPTQFlCVHHyHKDEENDALLGGPAQLTDEEKYRDCERFKCPCP 

TCGTENIYDNVFDGSGTDMEPSLYRCSNIDCKASPLTFTVQLSN 

KLIMD I RRFIKKYYDGWL I CEEPTCRNRTRHLPLQFSRTGPLCP 

ACMKATLQPEYSDKSLYTQLCFYRYI FDAECALEKLTTDHEKDK 

LKKQFFTPKVLQDYRKLKNTAEQFLSRSGYSEVNLS KLFAGCAV 

KS 


5936 


1124 


139 


RGEEQFDAEFRRFACLGFGERLQEFSPJLLRAVHRSRAWTCYLAI 
RMLMATCCPSPTTTACTOPWQRAPPLRLLVQKREADSSGLAFAS 
NSIiQRRKKGLLLRPVAPLRTRPPLLISLPQDFRQVSSVIDVDLL 
PETHRRVRLHKHG S DRPLG FY I RDGMS VR VAPQG \ L ER VPG I FI 
SRLVRGGLAESTGLLAVSDEI LE V$G I EVAGKTLNQVTDMMVAN 
SHN\LIVTVKPANQRNNWRGASGRLTGPPSAGPGPAEPDSDDD 
SSDLVIENRQPPSSUGLSQGPPCWDLHPGCRHPGTRSSLPSLDD 
QEQASSGWGSRIRGDGSGFSL 


5937 


31 


1600 


PTqriLKSTvniJvsrPM/inKPvnrvvQT.aPTT?KVT agpvi/Tr \m — 

c isiiuAO i v ULd'iLttJjijy ui\K J yc. v ioJUrvc» A cr^v LjJM> r i v i.L»V X JL» 
YGLTSSYSLWWMLRSSLKQYS PEALREKSNYSDIPDVKNDFAPI 
IJiLJUDQYDPLYSKRFSIPliSEVSENKLKQINLNNEVfTVEKLKSK 
LVKNAQDKI ELHL FMLNGLPDNVFELTEMEVLSLEL I PEVKLPS 
AVSQLVNLKELRVYHSSLWDHPALAFLEEKLKI LRLKFTEMGK 
IPRWFHLKNLKELYIjSGCVIiPEQLSTMQ1*EGTODLKNLRTLYL 
KSSLSRIPQVVTDIiLPSLQKLSLDNEGSKLVVIiNNLKKMV^ 
LELISCDLERIPHSI FSLNNLHELDLRENNLKTVEE 1 1 SFQHLQ 
NLSCLKLWHNNI AYI PAQIGALSNLEQLSLDHNNIENLPLQLFL 
CTKLHYLDLS YNHLTF I PEE I QYL\SNLQYFAVTNNNIEMI,PDG 
LFQCKKLQCLLIXjKNSIiMNIiS PHVGELSNLTHREPIG \ W YU5TI* ~ 
P PELBGCQSLKRNCL I VE ENLLNTLPLPVTERLQTCLD KC 


5938 


395 


1865 


YKGEGFFCNQEARGERRKKKKAMSSPNI WSTGSSVYSTPVPSQK ™ 
MTVWILIiLLSLYPGFTSQKSDDDYEDYASNKTWVLTPKVPBGDV 
TVILNNLLEGVDNKliRPDIGVKPTLIHTDMyVNSIGPVNAINME 
YT I DI F F AQTW YDR RL KFNS T I K VXiRLNSNMVG KI W I PDT FF RN 
S KKADAHWI TTPNRMLR I WNDGR VLYSLRLTI DAECQIiQIjHN FP 
MDEHSCPIiEFSS YG YPR E EI VYQWKRSS VEVGDTRSWRL YQFSF 
VGZiRNTTE WKTTSGDYWMSVYFDLSRRWGYFTIQT YI PCTLI 
VVLSWVSFWINKDAVPARTSLG1TTVLTMTTLSTIARKSLPKVS 
YVTAMDLFVS VCFI FVFSALVE YG \TLH Y FVSNRKPS KD KD KKK 
KNPAPTID IRPRSATIQMNNATHLQERDEEYGYECLDGKDCAS P 
FCCFEDCRTGAWRHGRIH I R I A KMDS YAR I FFPTAFCIiFNIiVYW 
VSYLYL 


5939 




1404 


I RPG YLKEVQENS PGHRAG LEP FFDF I VS I ^GSRLNKDWdtZ*K5 
LLKANVEKPVKMLIYSSKTT.HT.PPTCv*TOCisrr uprnnTT/riroTn 

FCSFDGANENVWHVLEVESNSPAAIJU3LRPHSDYIIGADTVMNE 
SBDLF S L I ETH EAKPLKL YV YNTDTDNCRE VT I TPNSAWGGEGS 
LGCG I GYG YLIIRI PTR PFEEGKKI S LPGQMAGTP I TPLKDGFTE 

VQLSSVNPPSLSPPGTTGIEQSLTGLSISSTP\PAVSSVIiSTGV 
PTVPVLLPPGVNOSLTSVPPMESSV7.WT.PnT.MDT?Ti7rvT t>mt or\ 

PSTFNLPR\PTHSVTPGVGljYQEFVKPGVliPPLSSMPPRNLPG\I 
APLPLPSEFLPS FPLVPESSSAASSGELLSSLP PTSNAPSDPAT 
TTAKADAASSLTVDVTPPTAKAPTTVEDRVGDSTPVSEKPVSAA 
VD ANAS ESP 


5940 


145 


717 


RRSASRSAS PRQSAGTAVTTGTRAGGTCIjAAAHHRMRWRADGRS 
LEKLP VHMGLVI TEVEQEPS FSDIAS LWWCMAVG IS YISVYDH 
QGIFKRNNSRLMDEILKQQQELLGLDCSKYSPEFANSNDKDDQV 
LNCHLAVKVLS PEDGKAD I VRAAQDFCQLVAQKQKRPTDLDVDT 
LA\VYliVQMWLILI 


5941 


13 


6147 


MCLGRMGASSPRSPEPVGPPAPGLPFCCGGSlXAVVVLLALPVA ' ' 

WGQCNA?EW\LPFARPTNX,TDEFBFPIGTYLNYECRPGYSGRPF 

SIICLKNSWTOAKDRCRRXSCRNPPDPVNGMVHVrKGIQFGSQ 

rKYSCTKGYRLIGSSSATCIISGDTVIWDNETPICDRlPCGLPP 

TITNODFISTNRENFHYGSWTYRCNPGSGGRKVFELVGEPSIY 

CTSNDDQVGIWSGPAPCCIIPNKCTPPNVENGILVSDNRSLFSL 

NEWEFRCQPGFVMKGPRR VKCQALNKWE PELPS CS RVCQPP PD 

VLHAERTQRDKDNFS PGQE VFYS CB PG YDLRGAAS MRCTPQGD W 

SPAAPTCEVKSCDDFMGQLLNGRVLFPVNLQLGAKVDFVCDEGF 

QLKGSSASYCVLAGMESLWNSSVPVCEQXFCPSPPVIPNGRHTG 

KPLE VFP FGKAVNYTCDPH PDRGTfl PDLI GEST IRCTSDPQGNG 

VPfSS PAPRCGILGHCQAPDHFLFAKLKTQTNASDFP IGTSLKYE 

CRPE YYG RP FS I TCLDNL VWS S P KDVCKRJCS CKTP P DP VNGMVH 

VITDIQVGSRINYSCTTGHRLIGHSSAECIIiSGNAAHTVSTKPPI 

CQRIPCGLPPTIANGDFISTNRENFHYGSWTYRCNPGSGGRKV 

FELVGEPSIYCTSNDDQVG1WSGPAPQCIIPNKCTPPNVENGIL 

VSDNRSLFSLNEWEFRCQPGFVMKGPRRVKCQALNKWEPELPS 

CSRVCQPPPDVLHAERTQRDKDNFS PGQE VFYS CEPGYDLRGAA 

S MRCTPQGDWS PAAP TCEVKS CDDFMGQL LNGRVLFPVNLQLGA 

KVDFVCDEGFQLKGSSASYCVLAGMESLWNSSVPVCEQIFCPSP 

PVIPNGRHTGKPLEVFPFGKAVNYTCDPHPDRGTSFDHGESTI 

RCTSDPQGNGVWSSPAPRCGI LGHCQAPDHFLFAKLKTQTNAS D 

FPIGTSLKYECRPBYYGRPFS ITCLDNLVWSS PKDVCKRKSCKT 

PPDPVNGMVHVITDIQVGSRINYSCTTGHRLIGHSSAECILSGN 

TAHWSTKPPiaQRIPCGLPPTIANGDFISTNRENFHYGSWTYR 

CNLGSRGRKVFELVGEPSIYCTSNDDQVGIWSGPAPQCIIPNKC 

TP PNVENG ILVSDNRS LFSLNEWE FRCQ PGFVMKG PRR VKCQ A 

tiMKWEPBLPSCSRVCQPPPEIIiHGEHTPSHQONFSPGQEVFYSC 

EPGYDLRGAASLHCTPQGDWSPEAPRCAVKSCDDFLGQLPHGRV 

L F PLNLQLGAKVS FVCD EGFRL KG S S VSHC VliVGMRS LWNNS V P 

VCEHIFCPNPPAILNGRHTGTPSGDIPYGKEISYTCDPHPDRGM 
TFNLIGESTIRCTSDPHGNGVWSSPAPRCkLdVRAGHCKTPEQP
P FAS PTI P I ND FE F PVGTSLNYECRPGYFGKMFS I S CLENL VW S
SVEDNCRRKSCGPPPEPFNGMVHINTDTQPGSTVNYSCNEGFRL
IGSPSTTCLVSGNNVTWDKKAPICEIISCBPPPTISNGDPYSNN
RTSPHNGTVVTYQCHTGPDGEQLFELVGERS I YCTS KDDQVGVW
SSPPPRCISTNKCTAPEVENAIRVPGNRSFFSLTEIIRFRCQPG
FVMVGSHTVQCQTNGRWGPKLPHCSRVCQPPPEILHGBHTLSHQ
DNFSPGQE VF YS CEPS YDLRG AAS LHCTPQGDWS PEAPRCTVKS
CDDFLGQLPHGRVLLPLNLQLGAKVSFVCDEGFRLKGRSASHCV
LAGM KALWNS S VP VCEQI FCPN P PA I LNGRHTGTPLGD I P YGKE
VS YTCD PHPDRGMTFNLIGEST I R RTS EPHGNGVWSSPAPRCEIj
PVGAACPHPPKIQNGHYIGGHVSLYLPGMTISYTCDPGYLIjVGK
GFI F CTDQG I WS QLDH YCKE VN CS FPLFMNG I SKE LEMKKVYHY
GDYVTLKCEDGYTLEGSPWSQCOADDRWDPPLAKCTSRTHDALI
VGTLSGTIFFILtillFLSWIILKHRKGNNAHENPKEVAIHLHSQ
GGSSVHPRTLQTNEENSRVLP


5942


4509


688


YLYTOMRANPl4AYGISHI0\YQIDPPL\RKHREQVLVIB\VGRKL" 
0K\AQM I RPEERTG YFSSTDIjGRTASHYYIKYNTI ETFNELFDA 
HKTBGDI FA I VS KAEEFDQ I KVREEE lEELDTLIiSNFCELSTPG 
GVENSYGKINIIJ^YINRGBMMFSLISDSAYVAQNAARIVRA 
LFE IALRKRWPTMTYRLLNIiS KAIDKRLWGWAS PLRQPS I LP PH 
MLTRLEEKKLTVDKLKDMRKDEIGHILHHVNIGLKVKQCVHQIP 
S VMMEAFI QP ITRTVLRVTLS I YAD FTNNDQ VHGTVGE P W W I WV 
EDPTNDHI YHSEYFEALKKQVISKEAQLLVFTI PI FEPLPSQYY 
I RAVSDRW EX5AEAVCI INFQHI*ILPERHP PHTELLDLQPLP ITA 
LGCKAYEALYNFSHFNPVQTQI FHTLYHTDCNVLLGAPTGSGKT 
VAAELAI FRVFNKY PTSKAVYI APLKALVRERMDD WKVR I EEKL 
GKKVIELTGDVTPDMKSIAKADLIVTTPEKWCGVSRSWQNRNYV 
QQVTILI I DEIHLLGEERGPVLEVI VSRTNFISSHTEKPVRI VG 
kSTALANARDIADWLNlKQMGLFNFRPSVRFVPLEVHIQGFPGQ 
H YCPRMASMNKPAFQAIRSHS PAKP VLI FVSSRRQTRLTALEI*! 
A?LATEEDPKQWLNMDEREMENIIATVRDSNLKLTIJIFGIGMHH 
[ AGLHERDRKTVEELPVNCKVQ VL I ATS TIAWGVNFPAHLVI I KG 
TE YYDGKTRRYVDFP ITDVLQMMGRAGRPQFDDQGKAVILVHD I 
KKDPYKKFLYEPFPVESSLLGVLSDHLNAEIAGGTIT3KQDALD 
YITWTYPFRRLIMNPSYYNLGDVSHDSVNKFLSHLIEKSLIELE 
LS YC I E I GEDNRS I E PLTYGRI AS YYYLKHQTVKM FKDRLKPE C 
S TEELL S I LSDAEEYTDLPVRHNEDHMNS ELAKCL P I ES NPHS F 
DS PHTKAHLLI43AHLS RA^P CPDYDTDTKT VLDQALRV^ 
DVAANQGPTLVTVLNITNLIQMVIQGRWLKDSSLLTLPNIENKHL 
HLFKKWKPIMKGPHARGRTSIECLPELIHACX3GKDHVFSSMVES 
ELHAAKTKQAWNFLSHLPEINVG ISVKGS WDDLVEOHNELSVST 
LTADKRDDNKWIKI»EADQEYVLQVSIjQRVHFQF11KGKPESCAVT 
PR FP KS KDEGWFLI LGEVDKRBLIALKR VGY IRNHHVAS LSF YT 

PEIPGRYIYTLYFMSDCYLGLDQQYD/NLSQRYTSESFCTGQHQ 



22*74 | DKPTRHKTYLSSSWAKMAAAEGP\^DGk'LWdTWI^NHVVFLRLR^ 

eglknqs pteaekpassslpss pppqlltrnwfglggblflwd 
gedssflwrlrgpsggg\eepalsqyqri»i»cinpplfeiyqvl 
lsptqhhvaligikglmvlexpkrwgknsefeggkstvncsttp 

VAERFFTSSTSLTLiCHAAWYPSEILDPHVVLLTSDNVIRIYSLR 
EPO/TPTNVIIIiSEAEEESLVLNKGRAYTASLGETAVAFDFGPLA 
AVPKTLFGQNGKDEWAYPLYILYENGETFLTYISLLHS PGN / 1 
WKAVGSIAHAS\AAEDNYGYDACAVLCLPCVPNILVIATESGML 
YHCVVIiEGEEEDDHTSEKSWDSRIDLIPSLYVPECVEIiELALKL 
AS GE DDPFDSDFSCP VKLHRDP KC PSRYHCTHEAGVHS VGLTWI 
HKLHKFI/JSDEEDKDSLQELSTBQKCPVEHILCTKPLPCRQPAP 
IRGEWIVPDILGPTMICITSTYBCLIWPLLSTVHPASPPLLCTR 
EDVEVAESPLRVLAETPDS FEKHIRSI LQRS VAKPAFLKAS EKD 
I AP PPEECLQLLS RATQVFREQ YI LKQDLAKEE I QRRVKLLCDQ 
KKKQL BDLS YCREERKS liREMABRLADKYEEAKEKQED IMNRMK 








KLlLHS FHSE L P VLSDS ERDWXKELQL I PDQLRHLGNAIKQVTMK 
KD YQQQKMEKVLSLPKPT 1 1 LS AYQRXCIQS ILKEEGEH I R E M V 
KQINDIRNHVNF 


5944


167


3428


PS IAT FTDEPEVLTEPPSATTTTTIG I S ATWTTLAGSHGKRNNT 
ITTTSS KRKNRKNKITPBNVQI IPDDPLPISYSQPEKVNGES KS 
SSTSESGDSDNMRISSCSDESSNSNSSRKSDNHSPAWTTTVSS 
KKQPSVLVTFPXEERKSVSGKASIKLSETISEGTSNSLSTCTKS 
GPS PLSS PNGKLTVA5PKRGQKREEGWKEWRRS KKVSVPSTVI 
SRVIGRGGCNINAIRBFTGAHIDIDKQKDKTGDRIITIRGGTES 
TRQATQL INAL I KDPDKB I DELI PKNRLKS SSANS KIGSS APTT 
TAANT3LMGIKMTTVALS STS QTATALT VP AIS SAS THKTI KJTP 
VN\NVRPGFPVSFP\LAYPPPQFAHALLAAQTPQQIRPPRLPMT 
HFGGTFPPAQSTWG PFP VRPLS PARATN S P KPHMVP RHS NQNS S 
GSQVNSAGS LTSS PTTTTS S S ASTVPGTSTNGS PS S P S VRRQL F 
VTWKTSNATTTTVTTTASNNNTAPTNATYPMPTAKEHYPVSS ? 
SSPSPPAQPGGVSRNSPLDCGSASPNKVASSSEQEAGSPPWET 
TNTRPPNSSSSSGSSSAHSNQQQPPGSVSQEPRPPLQQSQVPPP 
EVRMTVPPLATSSAPVAVPSTAPVTYPMPQTPMGCPQPTPKMET 
?AIRP PPHGTTAPHKN9AS VQNSSVAVLSVNHIKRPHSVPSS VQ 
LPSTLSTQSACQNSVHPANKPIAPNFSAPLPFGPFSTLFENSPT 
SAHAFWGGSWSSQSTPESMLSGKSSYLPNSDPLHQSDTSKAPG 
FRPPLQRPAPSPSGIVNMDS PYGSVTPSSTHLGNFASNISGGQM 
YGPGAPLGGAPAAANFNRQHFSPLSLLTPCSSASNDSSAQSVSfl 
GVRAPS PAPSSVPLGS EKPSNVSQDRKVP VPIGTERS AR I RQTG 
TSAPSVIGSNLSTSVGHSGIWSFEGIGGNQDKVDWCNPGMGNPM 
IHRPMSDPGVFSQHQAMERDSTGIVTPSGTFHQHVPAGYMDFPK 
VGGMPFS VYGNAM I PPVAPIPDGAGGPI FNGPHAADPSWNSLIK 
MVS SSTENNGPQTVWTG PWAPHMNS VHMNQLG 


5945 


1462 




GVTHL FLFGKRKLRNG IAEOLKGQADFF FLL VS EA WATGS PRA 
WLTCLXLPLPGIIFSVLPKAMSRPLLITFTPATDPSDLWKDGQQ 
QPQPEKPES TLDGAAARAF YEALIGDE S SAPDSQRS QTEPARER 
KRKKRR IMXAPAAEAVAEGASGRHGQGRSLEAEDKMTHRI IiRAA 
QEGDLPEIiRRLIiEPHEAGGAGGNINARDAFWWTPLMCAARAGQO 
AAVSYlI/SRGAAWGVCELSGRDAAQIiAEEAGFPEVARX-IVRESH 
GBTRSP ENRSPTPS LQ YCENCDTHFQDSNHRTS TAKLLS LSQG P 
QPPNLPLG VP I S SPGFKLIiLRGGWBPGMGLG PRGEGRAN PI PTV 
LKRDQEG LG YRSAPQ PRVTH FP AWDTRAVAGRE \TPPRVATLSW 

R1?RRPPT?R\ VnP21l(JT?DnT.O'7*VlurKTT E»B» 


5946 


541 


1666 


ILGSYSSIQPEEYS \SWC\EWLQDLLA\ YVSPK\HSYLRDLP ' 
SEGSPQRVNS IDFV\ EL\EHLQPDVLVHAVLR WDF/TI LTEAV 
YS YRG QKQ KKVMLTVEQAQDQHYAL VLWGPG AAW \ Y PQLQRKKG 
YI WE FKYIiFVQCNYTLENLELHTTP WS SCE CLFDDD IRAIT FKA 
KFQKSAPS FVKISDLATHLEDKCSGWLIKAQI SELAFP ITASQ 
KIALNAHSSIjKSIFSSLPNIVYTGCAKCGLEI.ETDENRIYXQCF 
SCLPFTMKKIYYRPALMTAIDGRHDVCIRVESKLIEKILLNISA 
DCI^VIVPSSEITYGMVVADLFHSLLAVSAEPCVLKIQSLFVL 
DENSYPLQQDFSLLDFYPDIVKHGANARL 


5947




1317 


RG I PDRRRRG P TOPVl^nT.PNKUK'ifiSTSr'T^wpnr'wr'iv'DrT" votrSr* — 

CEGFELHFWRKICRNC\NVAKKSM/TVLLSNEEDRKVGKLPEDT 
KYTTL I AKLKSDGI PMYKRNVM ILTNP VAAKKNVS INT VTYEWA 
PPVQNQALARQYMQMLPKEKQPVAGSEGAQYRKKQLAKQLPAHD 
ODPS KCHELS PREVKEMBQFVKKYKSEALGVGDVKLPCEMDAQG 
PKQMNI PGGDRSTPAAVGAMBDKS AEHKRTQYS CYCCKLSMKEG 
D PAI YAERAG YDKLWH P AC F VCSTCHE LL VDM I Y FWKNE KLY CG 
RHYCDSEKPRCAGCDEIiI FSNEYTQAENQNWHLKHFCCFDCDS I 
I^EIYVMVNDKPVCKPCYVKNHAVVCQGCHNAI DPEVQRVTYN 
NFSWHASTECFLCSCCSKCLIGQKFMPVEGMVFCSVECKKRI^IS 


5948


39 


3370 


YRERYPVSGGSVLRSALEVCWDFLSGLTEGSLLPEGFFSGPIDQ 
GNH YQMRRKGRCHRGSAARHPSSPCS VKHS PTRETLTYAQAQRM 
VEIEIEGRLHRIS I FDPLE I ILEDDLTAQEMSE CNSNKENSER P 
PVCLRTKRHKNNRVKKKNEALPSAHGTPASASALPE PKVRIVEY 








S PPS APRRP P VYYKF I E KSAE ELDNE VE YDMDEEDYAWIjE I VNE 
KRJCGDCVPAVS QSMFE PLMDRFEKE SHCBNQKQGEQQS L IDEDA 
VCCICMDGECQNSNVILPCDMCNLAVHQECYGVPYIPEGQWLC/ 
RAHCLQSRARPADCVLCPNKGGAFKKTDDDRWGHV\VCALW\ I p 
E\VGFANTVFrEPIIX3VRNIPPARWKLT\a7LCKEKGR/VGACI 
QCHTCANCYTAFHVTCAQKAGLYMKME P^KELTGGGTT PS VRKTA 
YCDVHTPPGCTRRPLNIYGDVEMKNGVCRKESSVKTVRSTSKVR 
KKAKKAXKALAE PCAVLPTVCAP YI P PQRLNR IANQVAIQRKKQ 
FVERAHSYWLLKRLSRNGAPLLRRLQSSLQSQRSSQQRENDEEM 
KAAKE KLK YW QRIiRHD LERARL L I ELLR KREKLKREQ VKVEQVA 
MBLRLTPLTVLLRSVLDQIiQDKDPARIFAQPVSLKEVPDYIjDHI 
KHPMDFATMRKRIiEAQGYKNLHE FE EDFDL I IDNCMKYNARDTV 
fyraavrlrdqggwlrqarrevds IGLEBASGMHLPERPAAAP 
RRPFSWEDVDRLLDPANRAHLGLEEQLRELLDMLDLTCAMKSSG 
SRS KRAKLLKKE I ALLRNKLSQQKSQPLPTGPGLEGFEEDGAAL 
GPEAGEEVLPRLETLLQPRKRSRSTCGDSEVEEESPGKRLDAGL 
TKGFGGARSBQEPGGGLGRKATPRRRCASESSISSSNSPLCDSS 
FNAPKCGRGKPALVRRHTLEDRSELISCIENGNYAKAARIAAEV 
GQSSMWISTDAAAS VUSPUCWWAKCSGYPSYPALI IDPKMPRV 
PGHHNGVTIPAPPLDVLKJGEHMQTKSDEKLFLVLFFDNKRSWQ 

WLPKSK>1VPLGIDETIDKL3KMMEGRNSSIRKAVRIAFDRAMNHL 
SRVHGEPTSDLSDI D 


5949 
5950 


39 


3370 


yrbrypvsgosvi.rsalbVcwdflsgltegsllpegffsgpidq'" 

GWHYQMRRKGRCHRGSAARHPSSPCSVXHS PTRETLTYAQAQRM 

veibieorlhrisifdpleui,eddltaqemsecnsnkbnserp 

PVCLRTKRHKNimVKKKNEAI»PSAHGTPASASALPEPKVRIV3Y 

SPPSAPRRPPVYYKFIEKSAEELDNEVEYDMDEEDYAWLEIVNE 

KRKGDCVPAVSQSMFEFIjMDRFEKESHCENQKQGEQQSLIDEDA 

VCC1 CMDGECQNSNVI LFCDMCNLA VHQECYGVP YI PEGQWLC / 

RAHCLQSRARPADCVLCPNKGGAFKKTDDDRWGHV\ VCALW\ IP 

E\VGFANWFIEPIDGVRNIPPARWKLT\CNLCKEKGR/vaACI 

QCHKANCYTAFHVTCAQKAGLYMKMEPVKELTGGGTTFSVRKTA 

YCDVHT PPG CTRRPLNI YGDVEMKNG VCRKESS VKTVRSTS KVR 

KKAKKAKKALAEPCAVLPTVCAPYIPPQRIiNRIANQVAIQRKKQ 

FVERAHSYWLLKRLSRNGAPLLRRLQSSLQSQRSSQQRENDEEM 

KAAKEKLKYWQRIiRHDLERARLL IELLRXREKLKREQVKVEQVA 

MELRIiTPLTVUjRSVLDQLQDKDPAR I FAQPVSLKEVPDYLDHI 

KHPMD FATM RKRLEAQG YKNLHE FEED FDL» 1 1 DNCM K YNARDTV 

FYRAAVRJbRDQGGVVLRQARREVDSIGLEEASGMHLPERPAAAP 

PJIPFSWBDVDRLLDPANRAHI/SI^EQLRELLDMI^LTCAMKSSG 

SRS KRAKLUCKE IALLRNKLS Q QHSQ PL PTGPGLEG FEEDGAAL 

GPEAGEE VLPRLETLLQPRKRSRSTCGDS EVEEES PGKRLDAGL 

TNGFG<3ARSEQEPGGGIiGRKATPRRRCASESSISSSNSPLCDSS 

FNAPKOGRGKPALVRRHTLEDRSELI SC I ENGNYAKAAR I AAEV 

GQSSMWISTDAAASVLEPLKWWAKCSGYPSYPALIIDPKMPRV 

PGHHNGVTIPAPPLDVLKIGEHMQTKSDBKLFLVLFFDWKRSWQ 

WLPKSKMVPLGIDETIDKLKMMEGRNSS IRKAVRIAFDRAMNHL 

SRVHGEPTSDLSDI0 




1166 


373 


ESRS^TMSTSQPGACPCQGAASRPAILYAIXSSSI^VPRPRSR 
CLCRQHRPVQLCAPHRTCREAIJ5VLAKTVAFliRNLPSFWQLPPQ 
DQRRLLQGCWGPIiFLLGLAQDAVTFEVAEAPVPSILKKILLEEP 
SSSGGSGQLPDRPQPSIAAVQWLQCCLES FWSLELSPKE \ YACL 
<GPILFNPDVPGI^AASHIGHLQQEAHWVLCEVLEPWCPAAQGR 
LTRVLLTASTLKS I PTS LLGDLFFRPI IGDVD IAGLLGDMLLLR 


5951 


i43 r 


§449 1 



^NVKPSLLWQi,FKFSDKEEHEQNDS ISGKTGETG VEEM IATRK 
7EQDS KETVKLSHEDDHILEDAGSSDISSDAACTNPNKTENSLV 

^pscvdevtecnlelkdtmgiadktentlernkiepi/;yceda 
5snrqlestbfnksnlevvdtstfgpesnilenaicdvpdqnsk 
jlnaibstkieshetanlqddrnsqsssvsylesksvkskhtkp 
/ikskqnmttdapkxivaakywihsktecvnvksvkrntdvpes 

^NFHRP\nCVRJCKQIDKEPKIQSCNSGVKSVKNQAHSVLiqcrLQ 








DQTLVQ I F KPLTHSLS DKSHAH PGCLKE PHHPAQTGH VS HS S QIC 
QCHEO>QQQAPAMFCTNSHVKBEl^HPGVEHFKEEDKLKLKKPEKN 
LQPRQRRSSKSFSLDBPPLPIPDNIATIRREQSDHSSSPESKYM 
WTPS KQ CGF CKKPHGNRFMVGCGRCDDW PHGDCVGLS LS QAQQM 
GEEDKEWCVKCCAEEDKKTEILDPDTLENQATVEFHSGDKTME 
CB KLGLS KHTTNDRT KYI DDTVKHKVKI LKRESGEGRNS SDCRD 
NE I fCKWQLAPLRKMGQ? VLPRRS S EEKS E KI PKESTTVTCTGE K ' 
ASKPGTHEKQEMKKKKV\EKGVLNVHPAASASKPSADQ I RQSVR 
HS L KD I LM KRLTDSNL KVPEE KAAKVATK I BKELFS F FRDTDAK 
YKNKYRSLMFNLKDPKNNILFKKVLKGEVTPDHLIRMS PEEIiAS 
KELAAWRRRENRHTIEMIEKEQREVERRPITKITHKGEIEIESD 
APMKEQEAAME IQB PAANKS LEKPEGSEK\RXEEVDSMSKDTTS 
QHRQHLFDLNCKICIGRMAPPVDDLSPKKVKVVVGVARKHSDNE 
AESIADALSSTSNILASEFFBEEKQESPKSTFSPAPRPEMPGTV 
E VEST FLARLN fi w kgfinmp s vakfvtkay? vsgs PEYLTEDL 
PDSIQWGRISPO^IVWDYVEKIKASGTKEICVVRFTPVTEEDQI 
S YTLL FAY FS SRKR YG VAANNMKQVKDMYL I PLGATDKIPHPLV 
PFDGPGLELHRPNLLLGL I IRQKIiKRQHSACASTSH IAETPES A 
PPIALPPDKKSKIEVSTEEAPEEENDFFNSFTTVLHKQRNKPQQ 
NLQEDIiPTAVEPLMEVTKQEPPKPLRFI/PGVLIGWENOPTTIjEL 
ANKPLP VDD ILQSLLGTTGQVYDQX AQS VMEQNTVKB I PFLNEQ 
TNS K I E KTDNVE VTDGENKE IKVKVDNI S ES TDKSAE I ETS WG 
SSSISAGSLTSLSLRGKPPDVSTEAPLTNLSIQSKQEETVESKE 
KTLKRQLQEDQENNLQDNQTSNSSPCRSNVGKGNIDGNVSCSBN 
LtvASi i AKiFyriNljiuaJPRQAAGRSQPVTTfiESKDGDSCRNGEK 
HMLPGLSHNKEHIjTEQINVEEKLCS AEKNS CVQQSDNLKVAQNS 
PSVENIQTSQAEQAKPLQEDIIiMQNIETVHPFRRGSAVATSHFE 

vgntcpsefpsksitftsrstsprtstnfspmrpqqpnlqhlks 
sppgfpfpgppnfppqsmfgfpphlpppllpppgpg\fa\qnpm 

VPW P P W\ HIiP XgQPQRMMGPLSQASRY IGPQNFYQVKD I RRPE 
RRHSDPWGRQDQQQLDRPFNRGKGDRQRFYSDSHHLKRERHEKE 
WEQES E RHRRRDRS QDXDRDRKSREEGHKDKBRARLSHGDRGTD 
GKASRDSRNVDKKPDKPKSEDYEKDKEREKSKHREGEKDRDRYII 
KDRDHTDRTKSKR 


5952 


322? 


639 


PPARRSARDIiPRALSMEAARPSGSWNGA1XUUiL\LVTI.\AFLIF 
ASDACKNVTLHVPSKI^AEiCLVGRVNLKECFTAANIilHSSDPDF 
QILEDG3VYTTNTILIjSSEKRSFTILI*SNTBNQEKKKIFVPLEH 
CTTKVIjKKRHTKEKVLRRAKRRWAPIPCSMLENSLGPFPLFLQQV 

qsdtaqnytiyysirgpgvdqeprnlfyverdtgnlyctrpvdr 

EQYES FEI IAFATTPDGYTPELP LPL 1 1 KI EDEMDN YP I FTEET 
YTFTIFENCRVGTTVGQVCATDKDEPDTMHTRLKYSI IGQVPPS 
PTLFSMHPTTGVITTTSSQLDRELIDKYQLKIKVQDE^DGQYFGL 
QTTSTCI INI DDVNDHLPTFTRTSYVTS VEENTVDVB ILRVTVE 
DKDLVNTANWRANYTIIJCGNE1K3NFKIVTDAKTNBGVLCVVKPL 
* ""^w iaajv^vj v vncuxtrr a luuioJriConno lAi VTVNVEDQDE 
GPECNPPIQTVRMKENAEVGTTSNGYKAYDPETRSSSGIRYKKL 
TDPTGWVTIDENTGSIKVFRSLDREAETIKNGIYNITVLASDQG 
ORTCTGTLGIILQDVNDNSPFIPKKTVIICKPTMSSAEIVAVDP 
DEP IHGPPFDFSLESSTSE VORMWRLKAINDTAARrj? ynvnoo t? 

GSYWPITVRDRXiGMSSVTSLDVTLCDCITENDCTHRVDPRIGG 
GGVQLG Ktf AILAI LLG I ALFFCI LFTLVCGASGTSKQ PKVI PDD 
I*AQQNL I VSNTEAPGDDKVYS ANGFTTQTVGASAQGVCGTVGSG 
I KNGGQBTI EM VKGGHQTSES CRG AGHHHTLDS CRGGHTEVDNC 
RYTYSEWHS FTQPRLGEES IRGHTXiIKJT 


5953 


330 


811 


PLLCNPDPGWYWWVKQESEISKESQEMDARPKIiDLGFKEGQTIK 
LCIGNITNiOCGGASKPRTARGGGLSLLPPPPGGKVTI PPPSS / V 
KliPSTNHVTPPSlPKSNHGGSDADILLDLiDSPAPVTTPAPTPVS 
V5NDLWGDFSTASSSVPN0APQPSNWVQF 


5954 


32 


2130 


P?PPPPKIANMADLEAVIJU>VSYI/^EKS^ 

P EPS I RS VMQKYLAE RNE ITFD XI FNQ KIGFLLFKDFCLNB INE 

^VPQVKFYEEIKEYEKLDNEEDRLCRSRQIYDAYIMKELLSCSH 








PFS KQ AVBHVQSHLS KKQ VTSTLFQPY I E E I CESLR(*b I FQKFM 
BSDKPTRFCQWKNVELNIHLTMNEFSVHRI IGRGGPGEVYGCRK 
ADTGKMYAMKCLNKKR I KM KQGETLAUfER IMLS LVS TGDCPF I 
VCMTYAPHTPDKLC FI LDLMNGGDLHYHLS OHGVPQ rttt?md T7v» 
TB 1 1 LGLEHMHNRFVVYRDLKPANILLDEHGHAR IS \ DLGLACD 
FS KKKPHASVGTHGYMAP E VLQKGTAYDSSADWFSLGCMLFKLL 
RGHSP FRQHKTXDKHE I DRMTLTVNVBLPDTFS P ELKS LLEGLL 
QRD VS KRLGCHGGGSQBVKEHS FFKGVDWQH V YLQKYP P PL I PP 
RGSVNAADAFDIGSFDEEDTKGIKLLDCDQELYKNFPLVTSERW 
QQEVTETVYEAVNADTDKI EAR KRAKNKQLGHEEDYALG KDCIM 
HGYMUCU3NPFLTQWQRRYFYLFPNRLEWRGEGESRQNLLTMEQ 
ILSVEETQIKDKKCILFRIKGGKQFVLQCESDPEPVQWKKELNE 
TFKEAQRLLRRAPKFLNKPRSGTVELPKPSLCHRNSNGL 


5955 


1726 


444 


*\ivu«_ei *v*~im v \_ir ua. i roAl J/vi 1 fc. JjRliCGI*CRSGQE FADCRR 
PANRQDVLSGWINLPVLQLTKDPLKTPGRLDHGTRTAFIHHREQ 
VWKRCINI WRDVGLFGVLNEIANS EEEVFEWVKTASGWALALCR 
WASSLHGSLFPHLSLRSBDLIAEPAQVTNWSSCCLRVFAWHPHT 
NKFAVALIiDDSVRVYNASSTIVPSLKHRLQRNVASLAWKPLSAS 
VLAVACQSC ILI WTLDPTSLSTRPSSGCAQVLSHPGHTPVTS LA 
WA PSG GRLLSAS PVDAAI R VWDVS TETCVPL PWFRGGGV3WLLW 
SPDGSKILATTPSAVFRVWEAQMWTCERWPTLSGRCQTGCWS PD 
GSRLLFTVLGEPLI YSLS FPERCGEGKG\ ALBVQSQQRLWQI CL 
RQQYRHQMVRRGLGERLTPWSGTPVGNVWLCL 


5956 


1705 


139 


CiVUVRGARAMATVQEKAAALNLSALriS PAHK P PGFS VAQKP FGA 
TYVWSS I INTLQTQVEVKKRRHRLKRHNDCFVGSEAVDVI FSHL 

FEDSSCSLYRFTTIPNQDSQLGKENKLYSPARYADALFKSSDIR 
S ASLEDLWBNL5 LKPANS PHVNISATLS PQ VINEVWQEETI GRL 
LQLVDLPLLDSLLKQQEAVPKIPQPKRQSTMVNSSNYLDRGILK 
AYSDSQEDEWLSAAIDCS E YLPDQMWBI SR5 FPEQPDRTDLVK 
E LLFDAIGRYYSS RBPLLNHLSDVHNGIAELLVNGKTE IALEAT 
QLLLKIlLDFQNREEFRRLLYFMAVAANPSEFKLQKESDNR^rVVK 

RIFSKAIVDNKNLSKGKTDLLVLFL\MDHQKDVFKIPGTL\HKI 
VS \ VK \ LMAI ONGRDPMR DARV T VPAtP rnrnj nvoMvufit? vwnTmn 

LLNI^KT3^EDSKLSAKBKKK\LLGQFYKCHPDIFIEHFnD 


5957 


1479 


451 


ELQVAVAHiraJRWKPXTKR^ " 
GGNANATVTK\^KDVYALKKPYGVLYKKKNITRPFEDQTSLEFF 
SKKSDCSLFMFGSHNKKRPNNLVIGRMYDYHVLDMIELGIENFV 
SLKDI KNSKCPEGTKPMLI FAGDDFDVTED YRRLKS LL I D FFRG 
PTVSNIRLAGLEYVI*HFTAI»NGKIYFR5YKLLLKKSGCRTPRIE 
LEEMGPSI^LVLPJ^THLASDDLYKLSMKMPKALKPKKKKNISHD 
TFGTTYGRI HMQKQDLS KLQTRKM \ KGLKKRPAER I T3DHE KKS 
KRI KKKLMELSQPLLFHCVLLKRI IKHQSIQSFL 


5958


1 


3138 



AAALGMLLWFPACQAFNLDVEKLT VYSGPKGSYFGYAVDFHIPD 
ARTASVLVGAPKANTSQPD I VEGGAVYYCP WPAEGSAQCRQI P F 
DTTNNRKI RVNGTKEP I EFKSNQ WFG\ATVKA\HKGKSCGPVAP 
LLFTWRNFLKPTPEKGPVGTCYVAIQNFSAYAEFS PCGNSNADP 
EGQGY CQAGFSLDFY KNGDLI VGG PGS PYWQGQVITASVADI IA 
NYSFKDILRKLAGEKQTEVAPAS YDDS YLGYSVAAGEFTGDSQQ 
ELVAG I PRGAQNFG YVS I INS YDMTFIQNFTGBQMAS YFGYTVV 
VS DVNSDGLDDVLVGAPLFMERE FESNPRE VGQI YLYLQVSSLL 
FRDPQILTGTETFGRFGSAMAHLGDLNQDGYNDIAIGVPFAGKD 
QRGJCVLIYNGNKDGLNTKPFPKFCQGVWASHAVPSGFGFTLRGD 
SDIDmDYPDLIVGAFGTGKVAVYRARPVVTVDAQLLLHPMI IN 
LENKTCQVPDSMTSAACFSLRVCASVTGQS IANT XVL MAEVQLD 
S LKQKGAI KRTLFLDNHQ AHRVFPLVI KRQKSHQCQDFI VYIiRD 
BTEFRDKLSPlNISLNYSLDESTFKEGLEVXPIRTYYRENrVSE 
3AHI LVDCGEDNLCVPDUCLSARPDKHOVI IGDENHLMLI XNAR 
^EGEGA YEAELFVM I PEEADYVG I ERNNKGFRPLSCEYKMENVT 
WWCDLGNPMVSGTNYSLGLRFAVPRLEKTNMS INFDLQ IRSS 
* KDN PDSNFVS LQINITAVAQVE I RGVSHP PQ I VLP IHNWEPEE 






WO 01/53312 



PCT/US00/34263 











EPHKEEEVGPLVEHI YELHNIGPSTISDTI LE VGWPFSARDE FL~ 
LYIFHIQTIiGPIiQCQPNPNINPQDIKPAASPEDTPELSAFLRNS 
TI PHLVRKRDVHWEFKRQSPAKILNCTNIECLQISCAVGRLEG 
GESAVLKVRSRLWAHTFl^RKNDPYALASLVSFEVKKMPYTDQP 
AKLPEGS1AIKTSVIWATPNVSFSIPLWVIILAILLGLLVLAIL 
TLALWKCG FFDRARPPQEDMTDREQLTNDKTPEA 


5959


1 


1166 


GTSGYAAQQLPSLLKERJEFHIiGTLNKVFASQWLNHRQVVCGTKC 
NTLFWDVQTSQITKIPILKDREPGGVTQQGCGIHAIELNPSRT 
LLATGGDNPNSLAIYRLPTLDPVCVGDBGHKDWIFSIAWISDTM 
AVSGS RDGS MGLWE VTDDVLTKS DARHNVS RVPVYAH I THKALK 
DI PK^DTNPDNCKVRAI^FNNKNKELGAVSLDGYraLWKAENTL 
SKLLSTKLP YCRBNVCLAYGSBWSVYAVGSOAHVS FLDPRQPS Y 
NVKS VCSRERGSGIRS VS F YEHI ITVGTGQGSLLF YD I RAQR FL 
EERLSACYGSKPRLAGENLKLTTG\KGWLNHDETWRNYFSDIDF 
FPNAVYTH C YDS 3 GTKL FVAGGPLPSGLHGNYAGLWS 


5960 


2853


870 


FVWSDGGPRPRRGPAVGAGAAHLSDPWAMTPGTANRATNPLNKE 
LDWAS INGFCEQIiNEDFEGPPLATRLLAHKIQS PQEWEAIQALT 
VLErCMKSCGKRFHDBVGKFRFLNELIKWSPKYLGSRTSEKVK 
NKILELLYSWTVGLPEEVKIAEAYQMLKKQG\IVKSDPKLPDDT 
TFPLPPPRPKNVI FEDEEKSKMLARLLKSSHPEDLRAANKLIKE 
XVQEDQKRMEKISECRVNAIEEVNNNVKLLTEMVMSHSQG3AAAG 

Q QT?r\T A MKT?tA YnOPPDMD DTT.?iypr>Dim<i*QmmV vat orr ^->«> 
aociitu u\i UKUSKHKr I ucr IuR VDTEDND\JBALiAEILQA 

NDNLTQVINL YKQLVRGEEVNGDATAGS I PGSTS ALLDLSGLDI. 
P PAGTTYPAMPTRPGEQAS PEQPSAS VSLLDDELMSLGIiSD PTP 
PSGPSLDGTGWNS FQSSDATEPPAPALAQAPSMESRP PAQTSLP 
ASSGLDDLDJjLGKZTLLQQSLPPESQQVRWEKQQPTPRLTLRDLQ 
NKSSSCSSPSSSATSLLHTVSPEPPRPPQQPVPrELSLASITVp 
LES IKPSNILPVTVYDQHGFRILFHFARDPLPGRSDVLWWSM 
LSTAPQPIRNIVFQSAVPKVMKVKLQPPSGTELPAFNPIVHPSA 
ITQVLLLANPQKEKVRLRYKLTFTMGDQTYNEMGDVDQFPPPST 
WGSL 


5961 


198 


3147 



SGBPRPEPGNMATCIGEKI EDFKVGNLLGKGS FAGVYRAES IHT 
GLEVAIKMIDKKAMYKAGMVQRVQNEVKIHCQLKHPSILELYNY 
FEDSNYVYLVLEMCHNGEMNR YLKNRVKPFSENEARHFMHQI I T 
GMLYI^SHGILHRDLTLSNIJ^LTRNMNIKIADFGIATQLKMPHE 
KHYTLCGTPNYI S P E IATRS AHGLES D VWSLGCM FYTLL I GR P P 
PDTDTVKNTLNKWLAD YEMPTFLS I EAKDLIHQLLRRNPADRIi 
SLSSVLDHPFMSRNSSTKSKDLGTVEDSIDSGHATISTAITASS 
STSISGSLFDKRRLLIGQPLPNKMTVFPKNK5STDFSSSGDGNS 
FYTQWGNQETSNSGRGRVIQDAEERPHSRYLRRAYSSDRSGTSN 
SQSQAKTYTMERCHSAEMLSVSKRSGGGENEERYSPTDNNANI F 
NFFKEKTSSSSGSFERPDNNQALSNHLCPGKTPFPFADPTPQTE 
TVQQWFGNLQ INAHLRKTTE YDS IS PNRDFQGHPDLQ KDTS KNA 
WTDTKVKKNSDASDNAHSVKQQNTMKYMTALHS KPEIIQQECVF 
GSDPIjSEQSKTRGMSPPWG YONRTTjT? ^ TT<;PT»VAHRi,ifPTpr»va» 
KKAWS I LDSEEVCVBLVKEYASQBYVKEVLQI SSDGNTITI YY 
PNGG \RGF PLA \ DRPPS PT \DNTSR \ YS F\DNL P EKYWRK YQ YA 
SRFVQLVRSKSPKITYFTRYAKCILMENSPGADFEVWFYDGVKI 
HJCreDFIQVIEKTGKSYTLKSESEVNSLKEEIKMYMDHANEGHR 
IC^iALESIISEEERlCrRflAPFFPIIIGRKPGSTSSPKALSPPPS 
VDSNYPTRDRAS FNRMVMHS AAS PTQAP I LNPSM VTN3GLGL7T 
TASGTDISSNSLKDCLPKSAQLLKSVFVKNVGWATQ\LTSGAVW 
VQFNDGSQLWQAGVSS IS YTSPNGQ\ TTR\ YGENEKLPDYI KQ 
KLQCLSS ILLMPSNPTPNFH 


5962 


20 


2447 


RVCSS S ASTA5QAVMADAWEE tRRLAAD FQRAQ FAEATQRLS ER " 
NC IE I VNXLI AQ KQLE WHTLDGKEYI T PAQ I S KEMRDELHVRG 
GRVNIVDLQQVINVDLIHIBNRIGDIIKSEKHVQLVLGQLIDEN 
YLDRLAE EVNDKLQES GQVT I SELCKTYDLPGNFIiTQALTQRLG 
RI ISGHI DLDNRG VI FTEAFVARHKARIRGLFSAITRPTAVNSL 
IS KYGFQEQIiL YS VLEELVNSGRLRGT WGGRQDKAVFVPDI YS 
RTQSTWVDS FFRQNG YLEFDALSRIiGI PDAVS Y I KKRYKTTQLI* 








FLKAACVQQGLVDQVEASVEEAISSGTWVDIAPLLPTSLSVBDA 

AI llqqvmrafskqastvvfsdtvvvsekf\ indctel FRELMH 

Q KAEKEMKNNP VHLI TEEDLKQ I STLESVS TS KKDKKD ERR RKA 
TEGSGSMRGGGGGNAREYKIKKVKKKGRiCDDDSDDESQSSHTGK 
KKPEISFMPQDEIEDPXJUCHIQDAPEEFISELAEYLIKPLMKTY 
LBVVR5 VFMSSTTS ASGTGRXRTI KDLQEBVSNL YNNI RLFEKG 
MKFFADDTQAALTKHIiLKS VCTD I TNLIFNPLAS DLMMAVDDPA 
AITSBIRKKILSKLSEETKVALTKLHNSLNEKSIEDFISCLDSA 
AEACD IMVKRGDKKRERQ I L FQHRQALAEQL BCVTEDPAL I LHLT 
SVLLFQPSTHSMLHAPGRCVPQI IAFLNSKI PEDQHALLVKYQG 
LVVKQLVSQSKKTGQGDYPLNNELDKEQBDVASTTRKELQEIjSS 
SI KDIiVLKSRKS SVTEE 


5963 


62 


1130 


PWN PQDFPGNRGLMG \QKGE I Q P P \ GQQGKKGAPGNlP \GLMGSN 
GSPGQPGTPGSKGSKGEPGIQGMPGASGLKGEPGATGSPGEPGY 
MGLPG I QGKKGDKGNQGEKG I QGQ KG ENGRQG I PGQQG I QGHHG 
AKGERGEKGEPGVRGAIGSKGESGVDGLMGPAGPKGQPGDPGPQ 
GPPGLDGKPGRBFSEQFIRQVCTDVIRAQIjPVUiQSGRIRNCDH 

cls qhgs pg 1 pg ppgp i gpegprg l pglpgrdg vpglvgvpgrp 
gvrglkglpgrngekgsqgfgypgeqgppgppcpegppgiskeg 
ppgdpglpgkdgdhgkpgiqgqpgppgicdpslcfsviarrdpf 

RKGPNY 




3 


2147 


scrtrgrlsplqpreagssrgsrarsepprpggmeeacqvqttk"" 

RGDPHELRNIFIiQYASTEVDGERYMTPEDFVQRYLGLYNDPNSN 

pkivqliagvadqtkdglisyqeflafesvlcapdsmfivafql 
fdksgngbvtfenvkbifgqtiihhhipfnwdcefirlhfghnr 
kkhlnyteftqflqelqleharqafalkdksksgmisgldfsdi 
mvt 1 rshmltp fveenlvs aaggs i s hq vsfs y fnafns llnnm 
elvrkiystlagtrkdaevtkeefaqsairygqatpiibidiiiyq 
ladlynasgrltlad i er i aplaegalp ynlaelqrqqspglgr 
p i wlq iaes ayrftlgs vagavgatavyp idlvktrmqnqrgsg 
swgeiimyknsfdcfkkvlryegffglyrgc.ipqligvapekai 

KLTVKDFVRJDKFTRRIX5SVPLPAEVLAGGCAGGSQVI FTNPLEI 
VKIRLQVAGEITTG PRVSALNVLRDLGI FG LYKG AKACFLRD I p. T 
FS A I YF P VYAHCKLLLADENGHVGGLNliliAAGAMAG\ VPAASLV 
TPADVIKTRI^VAARAGQTTYSGWIDCFRKIL\RBEGPSAFWKG 
TAARVFRSSPQFG\VTLVTYELLQRGFYIDFGGLKPAGSEPTPK 
SRIADLPPANPDHIGGYRLATATFAG I ENKFGLYLPKFKSPS VA 
WQPKAAVAATQ 


5965


1 


1498


MVTHLYRFIiPTSNMAAKLRSLLPPDLRLQFWIiHARtQKCFLSRG' ' 
CGSYCAGAKASPLPGKMAMGLMCGRRELLRLLQSGRRVHSVAGP 
SQWLGKPLTTRIiLFPAAPCCCRPHYLFLAASGPRSLSTSAISFA 
EVQVQAPP WAATPS PTAVP EVAS GETFAD VVQTAAE QS FAELGL 
GSYTPVGLIQNLLEFMHVDLGLPWWGAIAACT7FARCLIFPLIV 
TGQREAARIlINHLPEIQKFSSRIREAKIiAGDHIEYYKASSEMAL 
YQKKHG I EL YKPL I LP VTQAPI F I S FF I ALRBMANLPVPS LQTG 
GLWWFQDLTVSDP I Y I L PLAVTATMMAVLELGAETG VQS SDLQ W 
MRNVI RMMPL I TLP ITMHFPTAVFMYWLSSNLFSLVQVS CLR 1 p 
AVRTVLKI PQRWHDLDKLPPREGFLES FKKGWKNAEMTRQLRE 
REQRMRNQI*EIiAARGPLRQTFTHNPLLQPGKDNPPNI PSS \SS S 
S SKPKSKYP WHDTLG 


5966


102 


1925 


RSKQVMARLTKRRQADTKAIQHLWAAIE I IRNQKQIANIDRITK 
YMSRVHGMHPKETTRQLSLAVKDGLIVETLTVGCKGSKAGIEQE 
GYWLPGDEIDWETENHDWYCFECHLPGEVLICDLCFRVYHSKCL 
SDE FRLRDS S S PWQCPVCRS I KKKNTNKQEMGTYIiRFI VSRMKE 
RAIDLNKKGKDNKHPMYRJILVHSAVDVPTIQEICVNEGKYRSYEE 

F KADAQIjLLHNTVT fygadseqadiarmlykdtchel\delqlc 

KNCFYLANARPDNWFCYPCIPNHELDWAKMKGFGFWPAKVMQKE 
DNQVDVRP FGHHHQRAW I PS ENIQD I TVNIHRLHVKRSMGWKKA 
CDELELHQRFLREGRF WKS KNEDRGEEEAESS I S STSNEQLKVT 
QEPRAKKGRRNQS VE PKKEEPEPETEAVSSSQEI PTMPQPIEKV 
SVSTQTKKLSASS PRMLHRSTQTTNDGVCQSMCHDKYTKI FNDF 



5967



Tor 






1925 



KD RMKSDHKRBTEKV VRS ALE KL RS EMEEB KRQAVN KAVANMQG 



1288 



5969 



1126 



503 



~* v wmufi njitta cd»us c a jvxua VN KAVANMQG 

BMDRKCKQV1CBKCKEBPVEBIKICLAT1QHKQLIS0TKKKQWCYNC 
BEEAMYHCCWNTSYCSIKCQQEHWHA EHKRTCRRKR 

RSKQVMARLTKRRQADTKAI CjHLWAAIEIlRKQKQIANIDRITK " 

YMSRVKGMHPKETTRQLSl^VKDGLIVETLTVGCKGSKAGIEQE 

GYWLPGDEIDWETENHDWYCPECHLPGEVLICDLCFRVYHSKCL 

SDEPRIJ?DSSSPWQCPVCRSIKKXNTNKQEMGTYIJIFIVSRMKE 

RAIDLNKKGKDNKHPMYRRIiVHSAVDVPTIQEKVNEGKYRSYEE 

FKADAQ LLLHNTVI FYGADSEQAD I ARMLYKDTCHEL \ DELQLC 

KNCPYLANARPDNWFCYPCIPNHELDWAKMKGFGFWPAKVMQKE 
DNQVDVRFFGHHHQRAWIPSENIQDITVNIHRLHVKRSMGWKKA 
CDEIjELHQRFIjREGRPWKSKNEDRGEEEAESS isstsneqlkvt 
QEPRAKBXSRPJTQSVEPKKEEPEPETEAVSSSQEIP^PQPIEKV 
SVSTQTKKLSASS PRMLHRSTQTTNDGVCQSMCHDKYTKIFNDF 
KDRNKSDHKRETERWREALEKLRS emeeekrqavnkavanmqg 
EMDRKCKQVKE KCKEB FVEE I KKLATQHKOL I SQTKKKQWCYNC 
EEEAMYHCCWWTSYCSIKCQQEHWHAEHK RTCRRKR 

vrfprrgc^ptvltpgrqqgvkjUspqrpgsepdipargqphpp 

RPVGVSTSAQAQVQPPAMHRRRLALGLGFCLLAOTSLSVLWVYL 
ENWLPVSYVPYYLPCPEIFNMKLHYKREKPLQPVVWSQYPQPKL 
LEHRPTQLLTLTPWIaAPIVSEGTFNPELIiOTIYQPIjNLTIGVTV 
PAVGN/HPLESABEFFKRG YRVHYYI FTDN PAAVPGVPLG PHRL 
LSSIPIQGHSHWEETSMRRMEriSQHIAKRAHREVDYLFCLDVD 
MVFRNP WGPETLGDLVAAI H PS YYAVPRQQF P YER RR VS TA FVA 
DSEGDFYYGGAVFGGQVARVYEFTRGCHMAILADKANG IMAAWR 

EESHLNRHFXSNKPSKVIiSPEYLWDDRXPQPPSLKLIRFSTLDK 
DISCLRS 



UVGFWIKRKR^DVFliESPR KP5GRPJ3RAPEK0RRIAANKCLC 
TGVREGEPPS/TTSQKVKEAGRDFTYLIVVLFGISirGGLFYTI 
FKELFSSSSPSKlyGRALEKCRSHPEVIGVFGESVKGYGEVTRR 
GRRQHVRFTEYVKDGLKHTCVKFYIEGSEPGKDGTVYAQVKENP 
GSGEYDFRYIFVEIES YPRRTI I I EDNRSODD 



SQDNIGHRXI^KHGWKI^^ k^UjGRTDPlPIVVKYDVMGMG 
RMEMKJJDYAEDATERRRVLEVEKEDTEELRQKYKDYVDKEKAIA 
KALEDIJ2AMFYCELCDKQYQKHQEFDNHINSYDHAHKQRJ J KDLK 
QRE FARNVS S RS RKDE KKQE KALRRLHEIiAEQRKQ AECAPGSG P 
MFKPTTVAVDEEGGEDDKDESATN3GTGATASCGLGSEFSTDKG 
G PFTAVQ 1 TNTTGLAQAPGItAS QG IS FG 1 KKNLGTPLQKLGVS F 
S FAKXAPVKLES I AS VFKDHAEEGTS EDGTKPDEKSS DQGLQKV 
GDSDGSSNLDGKKEDEDPQDGGSLASTLS KLKRMKREEGAGATE 
PEYYHYIPPAHCKVKPNFPFLLFMRASEQMDGDNTTHPKNAPES 
KKGSSPKPKSCIKAAASQGAEKTVSEVSEQPKETSMTEPSEPGS 
KAEAKKALGGDVSDQSLESHSQKVSETQMCESN3SKETSLATPA 
GKESQEGPKHPTGPFFPVLSKDESTALQWPSELLIFTKAEPSIS 
YSCNPLYFDFKLSRNKDARTKGTEKPKDIGSSSKDHLQGLDPGE 
PNKSKEVGQEKIVRSSGGRMDAPASGSACSGIiNKQEPGGSHGSE 
TEDTGRSLPS KKERSGKS HRHKKKK KHKKSS KH KRKHKADTEEK 

SPA 
RKH 



pprrrrraqddsqrrslpaeegssgkkdeggggsssqdhgg; 



DASSDQSCYSRQRSYSDDSYSDYSDRSRRHSKRSHDSDDSDYAS 
SKHRSKRHKYSSSDDDYSI^CSQSRSRSRSHTRERSRSRGRSRS 

sscsrsrskrrsrsttahswqrsrsysrdrsrstrspsqrsgsr 
krswghespeerhsgrrdfirsfciyrsqsphyfrsgrgegpgkk 

DDGRGDDSKATGPPSQNSNIGTGRGSEGDCSPEDKNSVTAKLLL 

ekiqsrkverkpsvseevqatpnkagpklkdppqgyfgpklpps 
i^nkpvlpligklpatrkpnkkceesglergebqeqseteegpp 
gs sdalfghqfp\seettgplldpppeesksgevtadhpvaplg 
ppahfdcylgdptishnylpdpsdgntlesldsssopgpvbssl 
lpiapdlehfpsyappsgdpsiestdgaeda\siaplesqpitf 
5971



5972



5973 



53 



440 






65



1761



2007



"fvEKflKK Y5 KLQQAAQQHI QQ QLLAKQ VKAF P AS AALAPAT PAL " 
QPIHIQQPATASATSITTVQHAILQHHAAAAAAAIGIHPHPHPQ 
PLAQVHHIPQPHLTPISLSKLTHSIIPGHPATPLASHPIHIIPA 
SAIHPGPFTFHPVPHAALYPTLLAPRPAAAAATALHLHPLLHPr 
FSGQDLQHPPSHGT 

SFLYFVGVDMDNPIGNWTORFTOVQLC SFACVBSTILLHINDII ' 
PESVTQERRPPKLAPMSRGVGDKGSSSHNKPKATGSTSDPGNRN 
RSSLFYTLNGSSVDSQPQSKSKNTWVIDEVAEDPAKSLTEISTD 
FDR3SPPLQPPPVNSLTTENRFHSLPFSLTKMPNTNGSIGHSPL 
SLSAQSVNBELNTAPVQESPPIiAMPPGNSHGLBVGSIAEVKENP 
P F YGVIRWI GQP PGLNEVLAGLELEDE CAG \ CTDGTP /REGTR Y 
FTCALKKALFVKLKSCRPDSRFASLQPVSNQIBRCNSLAIWEAY 
LSEVVEEKTPTQKWEXEGLBIMIG\KKKGIQGHYNSCYLDSTLF 
CLPAFSSVLDTVLLRPKEKNDVEYYSETQEIiLRTE I VNPLR I YG 
YVCATKIMKLRKILEKVEAASGFTSEEKDPEEFIiNILFHHIIaRV 
B PLL KIRSAGQ KVQDC YF YQI FMEKNEKVGVPTIQQLLEWS FIN 
SNLKFAEAPSCLIIQMPRFGKDFKLFKKIFPSLELNITDLLEDT 
PRQCRICGGLAMYECRECYDDPD.T SAGKI KQFCKTCNTQVKLHP 
KR I^HKYNP VSIjPKDLPDWDWRHGCI PCQ2JMELFAVLCI ETSHY 
VAFVKYGKDDSAWLFFDSMADRDGGQNGFNIPOVTPCPEVGEYL 
KHS LEDLHS LDSRR I QG CARRLLCDAI YVPCTQS PTM3 L YK 
ILIiAGS PS PRDQCSQRQSSGGDKEljVTRGCTFSrAWS PSAMTQ 
E PFREELAYDRMPTLERGRQD PAS YAPDAKPSDLQLS KRL P PC F 
SHKTWFSVIMGSCLLVTSGFSLYLG^VFPAEMDYLRCAAGSCl 
PSA I VS FTVS RRNANV I PNFQI LFVS TFAVTTTCL I W FG CKL VL 
NPSAININFNLILLLIiLELLMAATVI IAARSSEEDCKKKKGSMS 
DSANILDEVPFPARVUCSYSWBVIAGISAVLGGI IALNVDDSV 
SGPHLSVTFFWILVACFPSA1ASHVAABCPNKCLVEVL1AISSL 
TSPLLFTASGYLSFSIMRIVEMFKDY?PAIKPSYDVIiLI*I»IiLLV 
LLLQA/ GPQHGHRHPVRALQGQCKAAGCILGHPERPAGAPGWGG 

GQEPPEGVRQGBSLESRRGAMGPVTPRRGNRVAAPSLAPGMETH 
NP 

NGDGKDLFGHI WAWRSNGI I S^JFRRSPHAGMAEDEPPAKS PKTG 
GRAP PGGAEAGEPTTLLQRLRGTISKAVQNKVEG I LQDVQKFS D 

ndklylylqlpsgpttgdkssepstlskebymyayrwirnhi.ee 

HTDTCLPKQSVYDAYRKYCESLACCRPLSTANFGKIIREIFPDI 
KARRLGGRGQSKYCYSGIRRKTLVSMPPLPGLDLKGSESPEMGP 
EVTPAPRDELVEAACALTCDWAERILKRSFSSIVEVARFLLQQH 
LISARSAHAHVLKAMGLAEEDEHAPRERSSKPKNGLENPEGGAH 
KKPERIiAQPPKDLEARTGAGP LARGE RKKS WES SAPGANNLQV 
NALVARLPLLLPRAPRSL1 PP I PVSPPILAPRLSSGALKVATLP 
LSSRAGAPPAAVPIINMILPTVPALPGPGPGPGRAPPGGLTQPR 
GTENRBVG IGGDQGPHDKGVKRTAEVPVS EASGQAPPAKAAKQD 
IEDTASDAKRKRGRPLKKSGGSGERNSTPLKSAAAMESAQSSRL 
PWETWGSGGEGNSAGGAERPGPMGEAEKGAVLAQG\QGDGTVSK 
GGRGPGS OHTKEAEDKI PLVPSKVSVI KGSRSQKEAFPLAKGE V 
DTAPQGNKDLKEHVLQS5LSQEHKDPKATPP 



2200 



I^LQMHTTSGRIHQAMVT^L NEDNESVTVEWIBNtiDTKGK\EID 
LESIFSLNP\DL\VPDGEIEPSP\ETPPPPASSAKVNKIVKNRR 
TVVASIKNDPPS\RDNRWGSARARPSQFPEQFSSAQQNGSV\S 
DISPVQAAKKEFGPPSRRKSNCVlCEVElQiQEKRBKRRLQOQELR 
EKRAQDVDATNPNYEIMCM1RDFRGSLDYRPLTTADPIDEHRIC 
VCVR KRPLNKKETQMKDLDVTTI PSKD WWVHE PKQKVDLTRYL 
ENQTFT^FDYAFDDSAFNEMVYRFTARPLVETIFERGMATCFAYG 
OTGSGKTHTMGGDFSGKN0DCSKGI YALAARDVFLMLKKPNYKK 
LELQ V YATFFE I YSGKVFDLLNRKTKLRVLEDG KQQVQ WGLQE 
REVXCVEDVLKLIDIGNSCRTSGQTSANAHSSRSHAVFQIILRR 
KGKLHGKFSLIDLAGNERGADTSSADRQTRLEGAEINKSLLALK 
ECIRAI^RNXPHTPFRASKLTQVLRDSFIGEWSRTCMIATISPG 
MAS CENTLlrrijRYANRVXELTVD PTAAGDVRP IMHHP PNQI \DD 
LETQWGVGSS PQRDDLKiLCEQNEEEVSPQLFTFHEAVSQMVEN 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
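The one-letter notation defined in the legend above can be checked mechanically. The following is a minimal sketch, not part of the original specification: the names `LEGEND`, `MARKERS`, and `summarize_segment` are hypothetical, and the function simply tallies residues, stop codons, and the possible-deletion and possible-insertion markers in a segment, reporting any character the legend does not define (for example digits or lowercase letters) for manual review.

```python
from collections import Counter

# Symbols as defined in the table legend (illustrative mapping, assumed complete).
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown", "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}

# Legend symbols that annotate the sequence rather than name a residue.
MARKERS = {"*", "/", "\\"}


def summarize_segment(segment):
    """Tally legend symbols in one amino acid segment, ignoring whitespace."""
    symbols = [ch for ch in segment if not ch.isspace()]
    counts = Counter(ch for ch in symbols if ch in LEGEND)
    # Anything outside the legend is kept for manual inspection, not guessed at.
    unrecognized = [ch for ch in symbols if ch not in LEGEND]
    return {
        "residues": sum(n for sym, n in counts.items() if sym not in MARKERS),
        "stop_codons": counts["*"],
        "possible_deletions": counts["/"],
        "possible_insertions": counts["\\"],
        "unrecognized": unrecognized,
    }


if __name__ == "__main__":
    # Segment copied from one row of the table.
    print(summarize_segment("AILEQKIDILTELRDKVKSFRAALQEEEQASKQINPKRPRAI*"))
```

The sketch is deliberately case-sensitive, because the legend defines only uppercase symbols; whether lowercase rows in the table carry a separate meaning is not stated here, so they are flagged rather than normalized.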








EEQWEDHRAVPQESIRWLEDEKALLEMTEEVDYDVDSYATQLE 
AILEQKIDILTELRDKVKSFRAALQEEEQASKQINPKRPRAI* 


5975 


4293 


2200 


lglqmhttsgrihqamvtslnedne5vtvewiengdtkgk\e id 
lesifsln?\ dl\ vpdgeiepsp \etppppassakvnkivknrr 
tv\asikndpps\rdnrwgsararpsqfpeqpssaqqngsv\s 

DISPVQAAKKEPGPPSRRKSNCVKEVBKLQEKREKRRLQQQELR 
E KRAQDVDATNPNYE I MCMI RD FRGSLD YRPLTTADP I DEHR I C 
VCVRiCRPI^NKKETQMKDIiDVITI PSKDVVMVHE PKQ KVDLTR YL 
ENQTFRFDYAFDDSAPNEMVYRFTAR PLVETT FE R GMA.TCFA YG 
QTGSGKTHTMGGDFSGKNQDCSKGIYALAARDVFIiMLKKPNYKK 
LELQVYATFFEIYSGKVFDWiNRKTKLRVLEDGKQQVQWGLgE 
REVKCVBDVLKLIDIGNSCRTSGQTSANAHSSRSHAVFQIILRR 
KGKLHGKPSLIDLAGNSRGADTSSADRQTRLBGABINKSLIiALK 
ECIRALGRNKPHTPPRASKLTQVLRDSFIGENSRTCMIATISPG 
MAS CBNTIiNTLRYANRVKELTVDPTAAGDVRP I MHHPPNQI \ DD 
LETQWGVGS S PQRD DL KLL CEQNEE BVS PQLFTFHEAVS QMVEM 
EEQWEDHRAVFQBS IRWLEDEKALLEMTE EVD YDVDS YATQ LE 
AILEQKIDXLTELRDKVKSFRAALQEEEQASKQINPKRPRAL 


5976 


20 


2949 



VHHLHLTRVSVWNLDHLRIAQQMGIKTIiNI#VLG\LKRA\LBF 

P E VS WME V KD PNMKG AMLTNTG K. Y A I PTIDA\EAYAIGKKEKPP 

FLPEEPSSSSEEDDPIPDELLCLICKDIMTDAWIPCCGNSYCD 

E C I RTALL ES DEHTCPTCHQNDVS PDAL IAilKFLRQAVNNFKNE 

TG YTKRLRKQLPS P PPP I PPPRPLIQRNLQPLMRSP I S RQQDPL 

MIPVTSSSTHPAPSISSLTSNQSSLAPPVSGNPSSAPAPVPDIT 

ATVSISVHSEKSDGPFRDSDNKILPAAALASEHSKGTSSIAITA 

LMEEKG YQVP VLGTPS LLGQSLLHGQLI PTTGP VRINTARPGGG 

RPGWEHSNKLGYLVSPPQQIRRGERSCYRSINRGRHHSERSQP.T 

GGPSkPATPVFVPVPPPPLYPPPPHTLPLPPGVPPPQFSPQFPP 

GQP\PPAGYSV?PPGFPPAPANLSTPWVSSGVQTAHSNTIPTTQ 

APPLSRBEPYREQRRLKEEEKKKSKLDEFTNDFAKELMEYKK1Q 

KERRRSPSRSKSPYSGSSYSRSSYTYSKSRSGSTRSRSYSRSFS 

RSHSRS YSRS P P YPRRGRGKS RNYRSRSRSHGYHRSRSRS P P YR 

R YHSRS RS PQAFRGQS PNKRNVPQGETEREYFKRYREV P PP YDM 

KAYYGRSVDFRDPFEKERYREWERKYREWYBKYYKGYAAGAQPR 

PSANRENFSPERFr.PLNIRNSPFTRGRREDYVGGQSHRSRNIGS 

NYPEKJ^ARIX3HWQKDNTKSKEICESEWAPGIX3KGNfCHKKHRKRR 

KGEESEGFLNPELLETSRKSREPTGVESNKTDSLFVLPSRDDAT 

PVRDEPMDABS ITFKSVSEKDKRBRDKPKAKGDKTKRKNDGS AV 

S KKEN I VKP AKG P Q E KVDG \ D VRDLLDLNL \QLKKP KEE TP KDL 

* ^i-^n/ii_»t-i_LTLKi*ii\JVoJj \trr \&i^iijWyQK\TPRwKTSQRGKSE 

EGLFQRCQIRKANN 


5977 


1363 


1336 


fledrgqvijShfqclslhs inhilhpgagUaagpaVgw /reyl^t 
pvlkeskfketgvitpeefvaagdhlvhhcptwqwatgeelkvtc 
aylptgxqflvtknvpcykrckqme ysdeleai ieeddgdggwv 
dtyhntgitgiteavkeitlenkdnirlqixisalceeeededeg 
eaadnee ybe sgiiletdbatlidtrkiveackaktdaggenailq 

TRTYDLYITYDKYYQTPRLWLFGYDBQRQPLTVEHMYEDISQDH 
VKKT VT I ENHPHIiPPPPMCS VHPCRHAEVMKKI I ETVAEGGGE L 
GVHMYLLI FLKFVQAVI PT I EYDYTRHFTM 


5978 


160 


3213 


RDGARR WGGCQS PLTWAPG F YRRFDLATSGRRLRGQTAEPAGRQ; 
RPRREPEAMDEQSVESIAEVFRCFICMEKLRDARLCPHCSKLCC 
FSCIRRWLTEQRAQCPHCRAPLQtiRELVNCRWAEBVTQQLDTLQ 
LCSLTKHEENEKDKCENHHEKLSVFCWTCKFCCICHQCALWGGMK 
GGHTFKPLAE I YEQH VTKVNE EVAKLRKRLME LIS LVQEVERNV 
EAVRNAKDERVREI RNAVEMMIARLDTQLKNKLITLMGQKTSLT 
QETELLESLLQE VEHQLRSCS KSELISKS SEI LMMFQQVHRKPM 
ASFVTTPVPPDFTSELVPSYDSATFVLENFSTLRQRADPVYSPP 
LQVSGLCWRLfCVYPDGNGWRGYYLS VFLEIiS AGLPETS KYEYR 
VEMVHQS CNDPTKNI IREFASDFEVGECWGYNRFFRLDLIANBG 
YLNP.QNDTVILRFQVRSPTFFQKSRDQHWYITQLEAAQTSYIQQ 
INNLKERLTIELSRTQKSRDLSPPDNHLS PQNDDALETRAKKSA 








CSDMLLER \GPYSAS WREAKEDEEDEEKIQHEDYHHELSDGDL " 

DLDLVYEDEVNQLDGSSSSASSTATSNTBENDIDEETMSGENDV 

ByNNMELEEGELMBDAAAAGPAGSSHGYVGSSSRISRRTHLCSA 

ATSSLLDIDPLILIHLLDLKDRSS I ENLWGLQPRPPASLLQPTA 

S Y S RKDKDQRKC^AMWRVPSDI* KPQLKRLKTQMAE VR CMKTDVKN 

TLSEIKSS5AASGDMQ7SLPSADQAALAACGTENSGRLQDLGME 

LLAKSSVANCYIRNSTNKKSNSPKPARSSVAGSLSLRRAVDPGE 

NSRSXGDCQTLSEGSPGSSQSGSRHSSPRALIHGSIGDILPKTE 

DRQCKALDSDAWVAVPSGLPAVEKRRKMVTLGANAKGGHLEGL 

0MTDLENNSBTGBLQPVLPEGA3AAPEEGMSSDSDIECDTENEB 

Q EEHTS VGG FHDSFM VMTQ PPDEDTHSS FPDGEQ XGPEOLS FNT 

DENSGR 


5979 


212 


3665 


LPDWiW YL.WLICLLAFGFAFLDTBVFVTGQSPTPS PTDAYLNASE " 
TTTLSPSGSAVTSTTTIATTPSKPTCDEKYANITVDYLYNKETK 
LFTAKLNVNENVE OGNNTCTNNBVHNLTEC KNASVS I S HNS CTA 
PI)KTLILDVPPGVEXVPVHCCS\QVEQPDSTIWLKWKNIETSTC 
DTQNITYRFQCGNMIFDNKEIKLENLEPEHEYKCDSEILYNSHK 
FTNASKIIKTDFGSPGEPOIIFCRSEAAHOGVITWNPPQRSFHN 
FTLCYI KETEKDC^NLDKNLIKYDLQNLKP YTKYVLSLHAYI T A 
KVQRNGSAAMCHFTTKSAPPSQVWNMTVSMTSDNSMHVKCRPPR 
DRNGPHERYHLEVEAGNTLVRNESHKNCDFRVKDLQYSTDYTFK 
AYFHNGDYPGEPFILHHSTS YNSKALIAFLAFLirVTS IALLW 
LYKIYDLHKKRSCNIXIEQQELVERDDEKQLMNVEPIHADILLET 
YKRKIADEGRLFLAE FQS I PRVFS KFP I KEARKPFNQNKNR YVD 
ILP YDYNRVEI#S EINGDAGSNYINAS YIDGFKEPRKYIAAQGPR 
DET VDDFWRMI WEQKATVI VMVTRCEEGNRNKCAEYWPSMEEGT 
RAFGBCCCKDLTKHKRCP\DYIIQKLNIVNKXEKATGRBVTHIQ 
FTS WPDHGVPED PHLLLKLRRRVNAFSNFFSGP I WHCSAG VGR 
TGTYIGIDAMLEGLEAENKVDVYGYWKLRRQRCLMVQVEAQYI 
lilHQALVE YNQ FGE TEVNLS ELKP YLHNM KKRDP P S E PS PLEAE 
FQRLPSYRSWRTQHIGNQE\BNKSKNRNSNVIPYDYNRVPLKHE 
LEWS KESEHDSDES SBDDSDSEEPS K YI NAS FI MS YWKP \EVM I 
AAQGPIiXETIGD FWQMI FQRKVKVIVMLTELKHGDQE I CAQ YWG 
EGKQTYGDIEVDlJcDTDKS S TYTLRVFELRHSKRKDS RTVYQ YQ 
YTNWSVEQLPAEPKELISMIQVVKQKLPQKNSSEGNKHHKSTPL 
LIHCRDGSQQTGI FCALLNLLESAETEEWDIFQWKALRKARP 
GMVSTFEQYQFLYDVIASTYPAQNGQVKKNNHQEDKIEFDNEVD 

KVKQDANCVNPLGAPEKLPEAKEQAEGSEPTSGTEGPEHSVNGP 
ASPALNQGS 


5980 


3 


23*3 


DAWGCiCLRRLRFT YGTQTR VSLALPGQ YEL VHTIiVAHQGNWET t 
PEEDLB VQENNE DAAHDLTELE\TTMHHALIiQBVDVVVAP CQGLR 
PTVD VLGDLVNDFIiP VITYALHKDELS ERDEQELQE I RKYFS F P 
VFFFKVPKLGSEIIDSSTRRMESBRSPLYRQIiIDLGYLSSSHWN 
CGAPGQITrKAQSMLVEQSEKLRHI^TFSHQVLQTRLVDAAKALN 
LVHCHCLDI FINQAFDMQRDLQITPKRLEYTRKKENELYESLMN 
1 AKRKQEEMKDM IVETLNTMKE EL LDDATNMBFKDVI VPENGE P 
VGTREI KCCI RQ I QEL I I SRLNQAVANKL I SS VDYLRES FVGTL 
ERCLQSLBKSQDVSVHITSNYLKQILNAAYHVEVTFHSGSSVTR 
ML WEQ I KQ I IQR I T WVS PPAI TLEWKR KVAQEAIES LS AS KLAK 
SICSQFRTRLNSSHEAFAASLRQLBAGHSGRLEKTEDLWLRVRK 
DHAPRLARLSLESRSI^QDVLLHRKPKLGQELGRGQYGVVYLCDN 
WGGHFPC^KSVVPPDEKHWNDLALEFHYMRSLPKHERLVDLKG 
SVIDYNYGGGSSIAVLLIMERLHRDLYTGLKAGLTLETRLQIAL 
DVVEGIRFLHSO^LVHRDIKLKNVLI^KQNRAKITDLGFCKPEA 
MMSGS IVGTPIHMAPELFTGKYDNS VDVYAFGILFW YICSGSVK 
LPEAFERCA3KDHLV7NNVRRGAR PERLPVFDEECWQLMEACWDG 
DPLKRPLLGIVQPMLQGIMNRLCKS\NSEQPNRGLDDST 


5981


1 


2519 


srkhsaamerpwgaadglsrwphglglllllqllppstLsqdrl 

DAPPPPAAPLPRWSGPIGVSWGLRAAAA\GGAFPRGGRWRRSAP 
3\EDEECORVRDFVAiOANNTHQHVFDDLRGSVSLSWVGDSTCV 
ILVLTTFHVPLVIMTFGQSKLYRSEDYGKNFKDITDLINNTFIR 


5982






TEPGMAlGPEKSGKVVLTAEVSGGSRGGRIFI^SDFAKNFVQroH 
LPPHPLTQMMYSPQNSDYLLAI/STENGLWVSKNFGGKWEEIHKA 
VCLAKWGSDNTIPFTTYANGSCKADLGALELWRTSDLGKSFKTI 
GVKIYSPtSI/SGIlFLFASVMADKDTTRRIHVSTDQGDTMSMAQLP 
SVGQEQFYS IIJVANDDMVPMHVDEPGDTGFGTI FTSDDRGIVYS 
KSLDRHLYTTTGGETDPTOVTSLRGVYITSVLSEDNSIQTMZTF 
DQGGRWTHLRKPBNSECDATAKNKNECSLHIHAS YS ISQKLNVP 
MAPLSEPNAVGI VXAHGSVGDAI S VMVPDVYTSDDGGYSWTKML 
EGPHYYTILDSGGIIVAIBHSSRPlNVIKFSTDEGQCWQrYTFT 
RD P I YFTGLASEPQARSMNI S I WGFTE S FLTSQWVS YT I DFKD I 
LERNCEEKDYTIWLAHSTDPEDYEDGCILGYKEQFLRLRKSSVC 
QNGRDYWTKQPS I CLCSLEDFLCDFGYYRPENDSKCVEQPELK 
GHDIjEFCL YGREEHLTTNGYRKI PGDKCQGGVNPVREVXDLKKK 
CTSNFL^PEKQNSKSNS\^IIIAIVGI^LVTVVAGVLI^<KYVC 

ggrflvhlysvlqqhXaeaNngvdgvdaldtashtnk^gyhdds 

DEDLLE 




56 


2316 


atrpprgsswcrqfsrtasaapgrsnmlripvrkalvglsksptH 

GCVRTTATAASNLIEVFVDGQS VTWEPGTTVLQACEKVGMQI PR 
FCYHERIiSVAGNCRMCLVEIElCAPKWAACAMPVMKGWNILTNS 
EKS KKAREGVMEFIjIiANHPLDCP icdqggecdlqdqsmmfgndr 
SRFLEGKRAVEDKNIGPLVKTIMTRCIQCTRCIRFASEIAGVDD 

lgttgrgndmovgtyiekmfmselsgniidicpvgaltskpyaf 
tarpwetrktesidvmdavgsniwstrtgevmrilprmhedin 
eewisdktrfaydglkrqrltbpmvrnekglltytswedalsrv 
aqmlqsfqgkdvaaiagglvdaealvalkdllnrvdsdtlcteb 
vfptagagtdlrsnyllntnagveeadwllvgtnprfeaplf 

NARIRKSWLHNDLKVALIGSPVDLTYTYDHLGDS PKILQDIASG 
SHP FS Q VLKEAXKPMWLGSS ALQRNDGAAI LAAVS S IAQKIRM 
TSGVTGDWKVMNILHRIASQVAALDLGYKPGVEAIRKNPPKVLF 
LLGADGGCITRQDLPKDCFIIYQGHHGDVGAPIADVILPGAAYT 
EKSATYVNTEGRAQQTKVAVTPPGLAREDWKI I RAL»SEI AGMTL 
PYDTL \ DQVRNRIiEEVS PNLVRYDDIBG\ANYFQQANBLS KL VN 

QQLLADPLVPPQLTMKpFWDSISRASQTMAKCVKAVTEGAQA.. 
VEEPSIC 


5983
5984 


248 


1763 


eargdggrrrhrasgrragrgepVaglksqgqravpkravargg 

RQ\ YS AAIALLE PAGES IADDLS ILYSNRAACYLKEGNCSG C I Q 
IX^RAIiHI^PFSMKPIiLRRAMAYETLEQYGKAYVDYKTVLQIDC 
GLQIJu^SVNRLSRILMELDGPKWREKLSLIPAVPASVPLQAWH 
PAKEMISKQAGDSSSHRQOGITDEKTFKALKBEGNQC^NDKNYK 
DALSKYSECLKINFnCECAIYTNRAIiCYLKLCQFEEAXQDCDQAL 
QLADGNVKAFYRRALAHKGLKNYQKSLIDLNKVILLDPSIIEAK 
MELEE VTRLLNLKDKTAP FNKEKERR KI EIQE VNEGKEEPGR PA 
GEVSTGCLASEKGGKSSRSPEDPEXLPIAKPNNAYEFGQI INAL 
STRKDKEACAHLLAITAPKDLPMFXSNKI»ECa3TFI*LLIQSLKNN 
LIEKDPS LVYQHLLYLSKAER FKMMLTLI 8KGQKEL I EQLFEDL 
SDTPNNHFTLBDIQALKRQYEL 






755 


1193 


SSVCMACTYVgNI^KKQRSVSFIASGLMRVSTGPEIJ^HHSFVL "1 
TGDVGRRICRLLVGLFTKGDTSSKRVHPFSPGPCFLIiCDLARVG 
SS PK I NVS PFYQN \QTSTQRS CTVF VWQRCSLVGP FQ VTVFTM Y 
FHHSLRSISRFSSG 




5985 


22 


1408 



RR VARPGTAE P AKARRTVRRGRARRDIAGAERKAGVSERGDS'gTtH 

RRPNPS IPSAAAGMSHIQ1PPGLTELLQGYTVEVLRQQPPDLVE 

tAVt,xiTXKbK£aU<APASVLPAATPRQSIiGHPPP 

GDS ES EEDEDLE VPVPSRFKRRVSVCAETYNPDEEEEDTDPRVI 

HPKTDBQRCRLQEACKDIIiLFKPnjDQEQljS QVLDAMFER I VKAD 

EHVIDQGDDGDNFYVI ERGTYDI LVTKDNQTRS VGQYDNRGS FG 

ELALM YNT P RAATI VATS EGS LWGXiDR VTFRRI I VKNNAXKRKM 

FESFIESVPIiLKSLEVSERMKIVDVIGEKIYKR/DGERIITQGE 

K\ADSFYI 1 2SGEVS ILIRSRTKSNKDGGNQEVEIARCHKGQYF 

SELA^VTNKPRAA^YAVGDVKCLVMDVOAFERIjLGP CMD IMKR 

^ISHYEEQLVKMPGSSVDLGNLGQ 


5986 


1666 


484 


DAWKSTS LTPHWKLWGRHRGRRRGIAHPKNHLS PQQGGATPQ V P 
S PCCRFDS PRGPP PPRIjGLLGALMAEDGVRGS PP V PSGPPM BED 
V7jjj\n x irx^oAr AJl/lriyoGJUXjSCXIiPjjJGFGGQSGPEGERSI*APPDAS I 
LISNVCSIGDHVAQELFQGSDLGMAEEAERPGEK\AGQHSPLRE 
EHVTCVQS I LDEFLQT\ YGSL I PLSTDEWEKLED I FQ QE PST P 
SRKGLVI^LIQSYQRMPGNAMVRGFRVAyKRHVLTMDDIiGTLYG 
QNWLNDQVMNMYGDLVMDTVPBK\VHFPNSFFY\DKLRTKGYDG 
VKRWTKNVD I FNKELLIjIPIHLE VHWSLI S VDVRR RTI TYFDSQ 
RTLNRRCPKHIAKYLQAEAVKKDRLDFHQGWKGYFKMNVARQNN 

DSDCGAFVIiQYCKHLALSQPFSFTQQDMPKLRRQIYKELCHCKL 
TV 


5987 


1806 


484 


DAWKSTSLTFHWKLWGRHRGRRRGLAHPKNHLSPQQGGATPQVP 
SPCCRFDSPRGPPPPRLGLLGAIiMAEDGVRGSPPVPSGPPMEED 
GLRWTPKSPLDPDSGLLSCTLPNGFGGQSGPEGERSLAPPDASI 
LISNVCSIGDHVAQELFQGSDLGMAEEAERPGEK\AGQHSPIiRB 
EHVTCVQS 1LDEFLQT\YGSLIPLSTDEWE!CLEDIFQQEFSTP 
S RKGL VLQL I QS YQRM PGNAMVRGFRVA YKRHVLTMDDLGTLYG 
QNTOiNDQfVMNMYGDLVMDTVPEK^ 

VKRVrTKNVDIFNKELLL I PIHLEVHWS LIS VDVRRRTI TYFDSQ 

RTLNRRCPKHIJUCYLQAEAVKKDRLD7HQGWKGYFKMNVARQNN 

DSDCGAFVLQYCKHLALSOPFSFTQQDMPKLRRQIYKELCHCKL 
TV 


5988 


1292


410 


FTOXYFLSFliGLIiESSHSRDRIHNLVLMFLIiATHNLVWWFTCRFQ'" 

RLDCIYLNAGIMPNPQLNIKALLFGLFS\AEGLLTQGDKITADG 

LQEVFETDVFGHFILIRELEPLLCHSDNPSQLIWTSSRNARKSN 

FSLEDFQHSKGKEPYSSSKYATDLI^VALNRNFNQQGLYSNVAC 

PGTALTNLTYG I LPPF I WTLIMPAI LLLRFFANAFTLTP YNGTE 

ALVWLFHQKPESLNPLIKYLSATTGFGRNYIMTQKMDLDEDTAE 

KFYQKLLELEKHIRVTIQKTDNQARLSGSCL 


5989


194 


2610 


A^FPQHSQHVLEQI^QQRQI^LLCDCTFVVDGVHFKAHKAVLA" 
ACSE YFKMIiFVDQKDWHLD I SNAAGLGQVLEFM YTAKLSLS PE 
NVDDVL\ AVATFLQMQD 1 1 TACHALKS LAEPATS PGGNAEALAT 
EGGDKRA KEEKVATSTLS RLEQAGRSTP IGPSRDLKEERGGQAQ 
SAASGAEQTEKADAPREPPPVELKPDPTSGMAAAEAEAALSESS 
EQEMEVEPARKGEEEQKEQEEOEEEGAGPAEVKEEGSQIiENGEA 
PEENENEESAGTDSGQELGSEARGLRSGTYGDRTESKAYGSVIH 
KCEDCG KE FTHTGNFKRH I R IHTGEKPFSCRECSKAFSDPAACK 
AHEKTHS PLKPYG CEECGKS YR L I SLLNLRKKRHSGEARYRCED 
CX3KLFTTSGNLKRHQLVHSGEKPYQCDYCGRSFSDPTSKMRHLE 
THDTDKEHKCPHCDKKFNQVGNIjKAHLKIH IADGPLKCRECGKQ 
FTTSGNLKRHLRIHSGEKPYVC IHOQRQFADPGALQRHVRIHTG 
BKPCX3CVMCGKAFTQASSLIAHVRQHTGE1CPYVCERCGKRFVQS 
SQIiANHIRHHDNIRPHKCSVCSKAPVNVGDLSKHI I IHTGEKP Y 
LCDKCGRGFNRVDNLRSIIVKTVHQGKAG I KILEPEEGSEVS WT 
VDDMVTIATEALAATAVTQLTVVPVGAAVTADErEVLIG^lS^ 
VKQ VQEED PNTHILYACDS CGDKFLDANS LAQHVRI HTAQALVM 

FQTOADFYO^YGPGGTWPAGQVliQAGELVFRPRDGAEGQPALAE 
TSPTAPECPPPAE 


5990 


2 


4700 


FGPGPDSGGGARGSGWGSRSQAPYGTLGAVSGGEQVLLHEEAGD 
SGFVSLSRLGPSLRDKDLEMEELMLQDETLLGTMQSYMDASLI S 
LIEDFGSLGEVEM3I,PDP3WDFSPPS FLETSS PKLPSWRPPRSR 
PRWGQSPPPQQRSDGEEEEEVAS FSGQILAGELDNCVSS I PDFP 
^4HLACPEEEDKATAAE^1AVPAAGDESISSLSELVRA^IHPYCLPN 
LTHLASLEDELQEQPDDLTLPEG CWL EI VGQAATAGDDLEI P V 
VVRQVSPGPRPVLIjDDSLETSSALQLLMPTLESETEAAVPiCVTL 
CS EKEG L SLNS EE KLDS ACLLKPREWEPWP KEPQNPPANAAP 
GSQRAR FCGR KKKS KEQ PAACVEG YARRLRSS SRGQSTVGTEVTS 
QVDNLQ KQPQEELQKESGP LQGKG KPRAW ARAW AAALENS S PKN 
LERSAGQSSPAKEGPLDLYPKIADTIQTNPIPTHLSIiVDSAQAS 
PMPVDSVEADPTAVGPVLAGPVPVDPGIjVDLASTSSELVEPLPA 
EPVLINPVLADSAAVDPAVVPISDmiPPVimVPSGPAPVDIiALV 








D P VPNDLTP VDP VLVKS RPTDPRRGAVS S ALGGSAPQL LVES E S 
LDPPKTIIPEVKEWDSLKIESGTSATTHBARPRPLSIjSEYRRR 
RQQRQAETEKRS PQ VPIXSKWPSLPETPTGIiADIPCLVI PPAPAK 
KTALQRSPETPLEI CLVPVGPS PASPSPEP PVSKPVASS PTEQV 
PSQEMPLLARPSPPVQSVSPAVPTPPSMSAALPFPAGGLGMPPS 
LPPPPLQPPSLPLSMGPVLPDPFTHYAPLPSWPCYPHVSPSGYP 
CLPPPPTVPLVSGTPGAYAVPPTCSVPWAPPPAPVSPYSSTCTY 
G PIX3WGPGPQHAPFWSTVP PPPL P PASIGRAVPQ PKMES RGTPA 
GPPENVLPLSMAPPLSLGLPGHGAPQTEPTKVEVKPVPASPHPK 
KKVSALVQS PQM KALACVSAEGVTVEEPAS ERLKPETQETRPRB 
KPPLPATKAVPTPRQSTVPKLPAVH PARLRKLS FLPTPRTQGS E 
DWQAFISBIG I EASDLS SLLEQPEKSEAKKECP PPAPADSLAV 
GNSGGVDIPQEKRPLDRIiQAPELANVAGLTPPATPPHQLWKPliA 
AVS LIAKAKS PKSTAQEGTLKPEG VT2AKH PAAVRLQEG VHGPS 
RVHVGSGDHDYC\VRSRTPPKK\MPALLIPEVGSRWNVKRHQDI 
TIKPVLSLGPAAPPPPCIAASREPLDHRTSSEQADPSAP CLAPS 
SLLSPEASPCRNDMNTRTPPEPSAKQRSMRCYRKACRSA5PSSQ 
GWQGRRGRNSRSVSSGSNRTSEASSSSSSSSSSSRSRSRSLSPP 
HKRWRRSSCSSSGRSRRCSS9SSSSSSSSSSSSSSSSSRSRSRS 
PS PRRRSDRRRR YSS YRSHDHYQRQRVLQKERA I EERRWFIGK 
IPGRMTRSELKQRFSVFGEIEECTIHFRVQGDNYGFVTYRYAEE 
AFAAIESGHKLRQADEQPFDLCFGGRRQFCKRSYSDLDSNREDF 
DPAPVKS KFDS LDFDTLLKQAQKNLRR 


5991 


334 


1379 


RLSSHFSQCSPS IYC\TKFDKQGNVTS FBRKKTELYQELGIiQAR 
DLRFQHVMS ITVRNNRI I MRMEYLXAVTTP BCLL I LDY RNLNLK 
QWLFR3LPSQLSGEGQLVTYPLPFEFRAIEALLQYWINTLQGKL 
S ILQPLILETLDALGDPKHS S VDRSKXiHILLQNGKSLSELETD I 
. KIFKESILEILDEEELLEELCVSKWSDPQVFEKSSAG1DHAEEM 
ELLLEN YYRIiADDLSNAARJSLRVLIDDSQS I IFINLDSHRNVMM 
RLNLQLTMGTFSLSLFGLMGVAFGMNLESSLEEDHRIFWLITGI 
MFMGSGLIWRRLLSFLGR/ltARSSIASYGMKDMVHGGIVEGL 


5992 


2 


609 



AGPDFRLVCGVSGSGFPGGRQGQATEWRPI^PWNGAMEKLRRVL 
SGQDDEEQGLTAQDSQINL/SEVLDASSLSFNTRLKWFAICFVC 
GVFFS XLGTGLLWLPGGI KLFAVF YtLgNLAALAS TCFLMGP VK 
QLKKMFEATRLLATIVMLLCFIFTLGAAIiWWHKXGLAVLFClLQ 
FLSMTW YSLS YIPYARDAVI KCCSSLLS 




1650 


594 


AEGLGS WAVWAGLGWAGRHMEAGGATGAIjGVG CKLPSAFCFPgIT"" 
SVAMDMFQKVEKIGEGTYGVVYKAKNR2TGQLVALKKIRLDLEM 
EGVPSTAIRE ISLLKELKHPNIVRLLDWHNBRlCLYIiVFEFIiSQ 
DLKKYMDSTPGSELPLHL I KS YLFQLLQGVS FCHSHRVIHRDUC 
PQNLLINEMAIIO^ADFGLARAFGVPLRTYTHEVVTLWYRAPEI 
LLATRFYTTAVDIWSIGCIPAEMVTRKALFPGDS\EIDQ\LFRI 
FRMLGTPSEDTWPGVTQLPDYRGSFPKWTRKGLEEIVPNLEPEG 
RDLLMQLLQYDPSQRITAKTALAHPYFSSPEPSPAARQYVLQRF 
RH 1 


5994 


394 



1934 


AGBVQLHVWIRGMRIQPQ/ICAAAIIPLPPDFEPQSRPRSCTWPL" 
PRPEIANQPSKPPEVEPDLGEKVHTEGRSEPILLPSRLPEPAGG 
PQPGILGAVTGPRKGGSRRNAWGNQSYAELISQAIESAPEKRLT 
LAQI YEWMVRTVP YFKDKGDSNSSAGWKNS IRHNLSLHS KFIKV 
HNBATGKSSWWMLNPEGGKSGKAPRRRAASMDSS3KLLRGRSKA 
PKKKPSGLPAPPEGATPTSPVGHFAKWSGSPCSRNREEADMWTT 
FRPRSSSNASSVSTRLSPLRPBSEVLAEEIPASVSSYAGGVPPT 
LNEGLE LLDGLN LTS SHSL LS RSGLSGFS LQH PG VTGPLHT YSS 
SLPS PAEG PliSAGEGCFS SS QALEALLTSDTP P PPADVLMTQVD 
PILSQAPTI^I^LPSSSKLATGVGLCPKPLBAPGPSSLVPTL 
SM I APP PVMA5API PKALGTPVLTPPTEAASQDRMPQDLDLDMY 
MENLECDMDNI ISDLMDEGEGLDFNFEPDP 


5995 


2 


2437 


RPPG PGPASGAWLCTRARGS AAFVPPLP R PPSRGARRRRRLPGR 
GVAALRRGPGSAPGLPRGRAERSAAGSGRGPSREERGAAAAAAA 
AEMMEELHSL\DP\ RRQELLEARF\TGLG VSKGPLNS ESSNQSL 
CSVGS1*SDKEVETPEKKQNDQRNRKRKAEPYETSQGKGTPRGHK 






WO 01/53312 



PCT/US00/34263











ISDYPERRVEQPLYGLIXJSAAJCEATEEQSALPTLMSVMLAKPRiT - 
DTEQLAQRGAGLCFTFVS AQQNSPSSTGSGNTEHSCSSQKQI S I 
QHRQT\QSDLTIEKISALENSKNSDLEKKEGRIDDLLRANCDLR 

RQI \DEQQKMLEKYK\ erlnr cpdneprnfli eks kqe kmacr d 

KSMQDRIiRLGHFlTVRHGASFTEQWTDGYAPQNLIKQQERINSQ 
REEIERQRKMLAKRXPPAMGQAPPATNEQKQRKSKTKGAENETL 
TLAEYHEQEEIFKLRI^HLKKEEAEIQAELERLERVRNLHTREI, 
KRIHNEDNSQFKDHPTLNDRYLLLULLGRGGFSEVYKAPDXjTEQ 
RY VAVKIHQLNKNWRDEKKENYHKHACRBYRI HKBLDHPRI VKL 
YDYFSLDTDSFCTVTjEYCEGNDLDFYLKQHKLMSEKEARSIIMQ 

ivnalkylne i kpp i i hydlkpgnillvngtacgeikitdfgls 
kimdddsynsvdgmbltsqgagtywylppecfvvgxeppxisnk 
vdvwsvgvifyqclygrkpfghnqsqqdilqentilkatevqfp 
pkp wtpeakaf irrclayrke dr idvqqlacdpyllph i rks v 
stsspagaaiastsgasnnsssn 


5996


1612 


981 


DQQACLLGLMLTLE FG I LE FD P S W I GSWTQR / S W VS WRSRPGCB 
LFS I WFGS I VNBGYLNSASEGEEFCI YNRNPNACSYGVAVGVI, 
AFLTCIiLYLALDVYFPQISSVKDRKK\AVLSGHPWSGEPHPAA 
FWAFLWFTGDS CYL \ ANQWQ VS KP KDNPLN EGTDAS PGRPS PFS 
FFS I FTWS LTAALAVR R FKDLS FQE E YS TLFP \ ASAQ P 


5997 


1612 


981 


DQQACLLGLMLTLEFGILEFDPSWIGSWTQR/SWVSWRSRPGCE 
LFS IWFGS I VNEGYLNSASEGEEFCI YNRNPNACS YGVAVGVI, 
AFLTCLLYLALDVYFPQISSVKDRKK\AVLSGHPVVSGEPHPAA 
FWAFLWFTGDS CYL \ANQWQVS KPKDNPLNEGTDAS PGRPS PFS 
FFS I FTWSLTAALAVRRFKDLSFQEEYSTLFP\ AS AQP 


5998 


1612 


981 


DQQACl^LMLTLEFGILEFDPSWIGSW , rUR/sWVSWRSRPGCE r ~ 
LFS I WFGS 1 VNEGYLNSASEGEEFCI YNRNPNACS YGVAVGVL 
AFLTCLL YLALD VYFPQ I S S VKDRKK\AVLSGH P WSGEPHPAA 
FWAFLWFTGDS CYL\ANQ WQVS KP KDNPLNBGTDAS PGRP S PFS 
FFS I FTWS LTAALAVR R FKDLS FQEEYS TLFP \ AS AQP 


5999 


2 


1790 



RP PME KARRGGDGVPRGP VLH I VWG FHHKKGCQVEFS Y PPLI P 
GDGHDSHTLPEEWKYLPFLALPDGAHNYQEDTVFFHLPPRNGNG 
ATVFG I SCYR \Q IEAKALKyRQADI TRETVQKS VCVLS KLPLYG 
LLQAKLQLITHAYFE^KDFSQISILKELYEHMNSSLGGASLEGS 
QVYLGLSPRDLVLHFRHKGLILFKL I LLEKKVLFYI SPVNKL VG 
ALMTVLSLFPGMIEHGLSDCSQYRPRKSMSEDGGLQESNPCADD 
FVSASTADVSHTNLGTIRKVMAGNHGEDAAMKTEEPLFQVEDSS 
XGQEPNDTNQYLKPPSRPSPDSSESDWETLDPSVLEDPNLXERE 
QLGSDQTNLFPKDS VPS BS LPITVQPQANTGQWLI PGLISGLE 
EDQYGMPLAI FTKGYLCLPYMAI/X)HHLI£DVTVRGFVAGATNI 
LFRQQKHLSDAI VEVEEAL IQ IHDPELRKLLNPTTADLRFADYL 
VRHVTENRDDVFLDGTGWEGGDEWIRAQFAVYIHALLAATLQLV 
LFR I VNVAKKI GNVMVTT \ SRNWQTGK\AVGQSVGG AFS \ SAK 
TA\MSS WLSTFTTS TSQSLTEP PDEKP 


6000


101 


1561 


TEPCRTAENCTATMSENNKNSLESSLRQLKCHFTWNLMEGENSL 
DDFEDKVFYRTEFQNREFKATMCNLIiAYLKHLKGQNEAALECLR 
KABELIQQEHADQASIRSLVTWGtWAWVYYHMGRLSDVQIYVDK 
VKHVCEKFSSPYRI3SPELDCEEGWTRLKCGGNQNERAKVCFEK 
ALEKKPKNPEFTSGLAIASYRLDNWPPSQNAIDPLRQAIRLNPD 
NQYIJCVLIJU,KLHKMREEGEEEGEGEK\LVEEALEKAPG\VTDV 
LRSAA\ KFYRG KDE PDKAI ELLKKALEY I P \NNAYLHCQIGCCY 
RAKVFQVMNLRBNGMYGKRKLLELIGHAVAHLKKADEANDNLFR 
VCS1LASLHALADQYEDABYYFQKEFSKELTPVAKQLLHLRYGN 
FQLYQMKCEDKA1HHFIEGVKINQKSREKEKMKDKLQKIAKMRL 
S KNG ADSEALHVLAFLQELNE KMQQADE DS ERGLESGSLI P S AS 
SWNGB 


6001 


176 


1038 


AFAHS PSRGHRKTHIHTPRHTPRCTMAESHLQSSLI TASQFFE I 
WLHFDADGSGYLEGKELQNLIQELQQARKKAGLELS PEMKTFVD 
QYGQRDDGKIGI VELAHVLPTEENFLLLFRCQQLKS CE\EFMKT 
WRKYDTDHSGFlETEELKNFLKDLLEKAlfKTVDDTKLAEYTDLM 








LKLFDS^IXJKLELTEMARLLPVQENFLLJa^QGIKMCGKEFNKA - 

PELYDQDGNGYIDENELDALLKDLCEKNKQDLDINNITTYKKNI 
MALSDGGKLYRTDLAIj I LCAGDN 


6002 


977 


81 


IAPPGGGLHIPPRTPJLtSHSRPPPSHHAPHPSPLPIiPPADLriPHS 
SMAQRSDLLEIiDCQLTRDR\AfVVSHDENLCRQSGLNRDVGSLDP 
EDLPliYKEICLEVYFSPGHPAHGSDRRMVRLBDriFQRPPRTPMSV 
EIKGKNEELIREQ/VLVRRYDRNEITIWASEKSSVMKKCKAANP 
EMPLSPTI SRG FWVLLS Y YLGLLP F I P I PEKFFFCFLPNI INRT 
yPPFSCSCLNQLLAWSKWLIMRKSLIRHLEERGVQWFWCLNE 
ES DF12AAFS VGATG VI TD YPTALRH YLDNHGPAARTS 


6003 
6004 


140 


4038 



GKLRAFRGMRRLICKRI CDYKSFDDEESVDGNRPSSAASAFKVP~ 
APKTSGNPANSARKPGSAGG PKVGAGASKEGGAGAVDEDDFI KA 
Vl'DVPS IQI Y S SREIiEETLNKIRE IL SDDKHD WDQRANALKK IR 
SLLVAGAAQYDCFFQHLRLLDGALJCLSAKDLRSQWREACITVA 
KLSTVLGNKFDHGAEAI VPTLFNLVPNSAKVMATSGCAAIRPI 1 
RHTH VPRLI PL ITSNC7SK3 VPVRRRSFEFLDLLLQEWQTHSLE 
RHAAVL VE T I KKG I HDADAEAR VEAR KTYMGLRNB F PG EAETL Y 
NSLEPSYQKSLQTYLKSSGSVASLPQSDRSSSSSQBSLNRPFSS 
KWSTANPSTVAGRVSAGSSKASSLPGSLQRSRS DIDVNAAAGAK 
AHHAAGQSVRSGRLGAGALNAGSYASLEDTSDKLDGTASEDGRV 
RAKLS APLAGMGNAKADSRGRSRTKMVSQSQPGSRSGS PGRVXT 
TTAIiSTVSS GVQRVLVNS ASAQKRS KI PR6QG CSREAS PSRLS V 
ARSSRIPRPSVSQGCSREASRESSRDTSPVRSFQPLASRHHSRS 
TGALYAPEVYGASGPGYGISQSSRLSSSVSAMRVLNTGSDVEBA 
VADAIJjLGDIRTKKKPARRRYESYGMHSDDDANSDASSACSERS 
YS3RWGSIPTYMRQT\EDV\AEVLNRCASS^mSERKEGLW3LQN 
IiLKNQRTLSRVELKRLCEIFTRMFADPHGKRVFSMFLETLVDFI 
QVHKDDLQIWLFVLLTQLLKK>X^LLGSVQAKVQKALDVTRES 
FPNDLQFNILMRFTVDQTQTPSLKVKVAILKYI ETLAKQMD P GD 
FINSSETRLAVSRVXTWTTEPKSSDVRKAAQSVLISLFELNTPE 
FTMIiGALPKTFQDGATKLIiHNHLRNTGNGTQS SMGS PLTRPTP 
RSPANWeSPLTSPrNTSQNTLSPSAFDYDTENMNSBDIYSSLRG 

vteaiqnfsfrsqedmneplkrdskkddgdsmcggpg\msdpra 

G0DATDSSQTAL\DNKASLLHSMPTHSSPRSRDYNPYNYSDSIS 

pfnksalkeamfdddadqfpddlsldhsdlvabllkei^nhner 
veerkialyelmkltqeesfsviroehfktilu^etlx5dkept 

IRAIJUiK^LREILRHQPARFKNYASLTVMKTLEAHKDPHKEVVR 

SAEEAASV\IATSI\SPEQCIKVI^PII0^I7VDYPINLAAIKMQT 

KVIERVSKETLNI*I*LPEIMPGLIQGYDNSESSVRKACVFCliVAV 

HAVIGDELKPHLSQLTGSKMIG^NLYIKRAQTGSGGADPTTDVS 
GQS 




140 


4098 



GKIiRAFRGMRRLICKRICDYKSFDDEESVDGNRPSSAASAFKVP 
APKTS GNPANS ARKPGS AGGPKVGAGAS K EGGAGA VDEDDF I KA 
FTDVPSIQIYSSRELEETLNKIREILSDDKHDWDQRAMALKKIR 
SIJ^VAGAAQYDCFFQHLRI^GALia^SAKDLRSQVVREACITVA 
HLSTVLGNKFDHGAEAIVPTLFNLVPNSAKVMATSGCAAIRFI I 
RHTHVPRLI PI*ITSNCTSKSVPVRRRS FEFLDLIiLQEWOTHSIjE 
RHAAVLVET I KKG I HDADAEAR VEAR KTYMGLRNHFPGEAETLY 
NSLEPSYQKSLQTYLKSSGSVASLPQSDRSSSSSQESLNRPFSS 
KffSTANPSTVAGRVSAGSSKASSLPGSLQRSRSDIDVNAAAGAK 
AHHAAGQSVRSGRUJAGLMJ^AGSYASLEDTSDKLDGTASEDGRV 
RAKLSAPLAGMGNAKADSRGRSRTKMVSQSQPGSRSGSPGRVLT 
± iniu a voasBvytt VJjVNtoASAQKRSKXPRSQGCSREAS PSRLSV 
ARSSRIPRPSVSQGCSREASRESSRDTSPVR5FQPLASRHHSRS 
TGALYAPB VYGASGPG YGISQS SRLSSS VS AWRVLNTGS DVEBA 
VADALLLGD IRTKKKPARRRYES YGMHSDDDANSDAS SACSERS 
5f SSRNGS IPTYMRQT\EDV\AB VLNRCASSNWSERKBGLLGLQN 
bLKNQRTLSRVELKRIiCEI FTRMFADPHGKRVFSMFLETLVDFI 
3VHKDDLQDWLFVLLTQLL IOXMGADLLGS VQAKVQKALD VTRES 
?PNDI^FNILMRFT\/DOTQTPSLKVKVAILKYIETLAKQMDPGD 
'INSSETRIAVSRVITimCTKSSDVRKAAQSVLISLFEIiNTPE 



FTMIAGALF KTFQDGATKLLHNHlliRM^r GNGTQSSMGS PLTRPT^ ~ 
RSPANWSSPLTSPTNTSQNTLSPSAFDYDTENMNSEDIYSSLRG 
VTEAIQNPS FRSQEDMNEPIiKRDSKKDDGDSMOGGPG \ M3DPRA 
GGDATOSSQTAL\DNKASLLHSMPTHSSPRSRDYNPYNYSDS IS 
PFNKSALKEAMPDDDADQPPDDLSLDHSDLVAEtiLKELSNHNER 
VEERfCIAL YELMKLTQEESFS VWDEHFKTI LLLLLBTLGDKE PT 
IRAliALKVLREILRHQPARFKNYAELTVMJCTLEAHKDPHKEVVR 
SAEEAASV\IATSI\SPEO^IKVLCPIIQTADYPINLAAIKMQT 
KVIERVSKETLNLLLPEIMPGLIQGYDNSESSVRKACVPCLVAV 
HAVIGDELKPHLSQLTGSKMKLLNLYIKRAQTGSGGADPTTDVS 
GQS 



6005



133 



5955 



RSSGRRQEQLGQFTGRERKGMASGI^SPSPCSAGSBEEDMDALir 
NNSLP PPHPBNEBD PEE DLSE TETPKLKKKKKPKKPRD PKI PKS 
KRQKKERMLLCRQLGDSSGEGPEFVEEEEEVALRSDSEGSDYTP 
GKKJCKKXLG PKKE K KS KS KRKEE EE EDDDDDDDS KEPKSS AQLL 
EDWGMED I DHVFS EED YRTLTNYKAPSQ PVRPLIAAKNPK1AVS 
KMMMVLGAKWR3FSTNNP FKGSSGAS VAAAAAAAVAWESMVTA 
TEVAPPPPPVEVPIRKAXTKEX3KGPNARRKPKGSPRVPDAKKPK 
PKKVAPLKI KLGGFG S KRKRSS SEDDDLDVESDFDDAS INS YS V 
SIX3STSRSSRSRKKLRTTKKIOCKGEEEVTAVDGYETDHQDYCEV 
CQQGGEI ILCDTCPRAYHMVCLDPDMEKAPEGKWSCPHCEKEGI 
QWEAKEDNSEGEEILBEVGGDLEEEDDHHMEFCRVCKDGGELLC 
CDTCPS S YH IHCLNP PL PE IPNGEWLCPRCTCPALKGKVQKILI 
WKWGQPPSPTPVPRPPDADPNTPSPKPLEGRPERQFFVKWQGMS 
YWHCSWVSELQI*ELHC\QVMFRNYQRKNDMDEPPSGDFGGDEEK 
S\RKRKNKDPKFAEMBERFYRYGIKPEW\MMIHRIIiNHSVDKKG 
HVHYLIKWRDIiPYDQASWESEDVEIQDYDLFKQSYWNHRELMRG 
EEGRPGKKLKKVKIjRKIiERPPETPTVDPTVKYERQPEYLDATGG 
TLHPYQMEGLNWLRFSVfAQGTDTILADBMGIKSKTVQriAVFLYSL 
YKEGHS KGPFLVSAPLSTI IN\WEREFEMWAPDMYV\ VTYVGDK 
DSRAI I R EKE FS \ FEDKAI RGGIGCASRMKKEAS VKFH VLLTS YE 
LI T I DMAI LGS I D WACL I VDEAHRLKNNQSKFFR VLNG YSLQHX 
LLLTGTPLQNNLEELFHLLNFLTPERFHNLEGFI*EE FAD I AXE D 
QIIOaaiDMIX3\PHMLRRLKADVFKKMPSKTBLIV\RVEIiSPM\Q 
KKYYK\YILHSKFIiKALNV^GGGNQVSLLNVVMDLKKCCNHPY 
LFPVAAMEAPKMPNGPTYDGSALIRASGKLLIJiQKMLKNL^ 
RVLIFSQMTKMLDIjLBDFLEHEGYKYERIDGGITGNMRQEAIDR 
FNAPGAQQFC FLLS TRAGGIiG INLATADTVI I YDSDWNPHNDIQ 
AFSRAHR IGQNKKVM I YR FVTRASVE ER I TQVAKKKMMI/THL W 
RPGLGSKTGSMSKQELDDILKFGTEBLFKDEATDGGGDNKEGED 
SSVIHYDDKAIERIXDRNQDETEDTELQGMNEYLSSFKVAQYVV 
REEBMGEEEEVEREIIKQEBSVDPDYWEKLLRHHYEQQQEDLAR 
NLGKGKRIRKQVNYNDGSQEDRDWQDDQSDNQSDYSVASEEGDE 
DFDERSEAPRRPSRKGLRNDKDKPLPPLLARVGGNIEVLGFNAR 
QRKAFLNAIMRYGMPPQDAFTTQWLVRDLRGKSEKEFKAYVSLF 
MRHLCEPGADGAETFADGVPREGLSRQHVLTRIGVMSLIRKKVQ 
EFEHVNGRWSMPEIAEVEENKKMSQPGSPSPKTPTPSTPGDTQP 
NTPAPVP PAEDG I K IE ENS LKEEES IEGEKEVKSTAPETAI ECT 
QAPAPASEDEKWVEPPEGEEKVEKAEVKERTEEPMETEPKGKG 
AADVBKVEEICSAIDLTPIVVEDKEEKKEEEEKKEVMLQNGETPK 
DUTOEKQKXNIKQRFMFNIADGG FTELHS LWQNEERAATVTKKT 
YEIWHRRHDYWLLAGI INHG YARWQD I QNDPRYAILNE P FKGEM 
NRGNFLEIKNKFLARRFKLLEQALVIEEQLRRAAYLNMSEDPSH 
PSMALNTR FAEVECLAESHQHLS KESMAGNKPANAVLHKVLKQL 
EELLSDMKADVTRLPATIARIPPVAVRLQMSERNILSRLANRAP 
EPTPQQVAQQQ 



965 



DNDFLRNTVHRHEPPVTAEPIRIiLAENEDVVVVDKPSSlPVHPC 
GRFRHNTVI FILGKEHQLKELHPLHRLDRLTSGVLMFAKTAAVS 
ERIHEQVRDRQLEKEYVCRVEGEFPTEEVTCKBPILWSYKVGV 
CR VDPRGKP CETVFQRLS YNGQSS WRCRPLTGRTHQ I R VHLQF 
LGHPI LND P I YNSVAWGPSRGRGG YI PKTNEELLRDLVAEHQAK 








QSLDVLDLCEGDLSPGIiTDSTAPSSELGKDDLEELAAAA\QKI-lE 
B VAE AAP Q ELDTI ALAS E KA VE TDVMNQ \ RQT \ TLCR V P AG ATG 
SLAPRPCDVPTCPTL 


6007 


3 


2351 


helgqveyvftdktotltenemqfrecsingHkyqeingrlvpe" 

GPTPDSSEGNIjSYLSSLSHIiNNLSHLTTSSSFRTSPBNETEIjIK 
RHDLFPKAVS LCHTVQ INNVQTD CTGDGP WQSNLAPS QLE YYAS 
SPDEKALVEAAARIGlVFIGNSEETMEVKTLGKLERYKLLHIIiE 
PDSDRRRMSVIVQAPSGEKLLFAKGAESSILPKCIGGEIEKTRI 

hvdefalkglrtlciayrkftskeyeeidkr:feartalqqr\e 
e klaav pqfi exd lillgatavedrlqd kvret i ealrmag i kv 

WVLTGDKHETAVSVSLSCGHFHRTMNrLELINQKSDSECAEQLR 
Uuakk A i ttDHV I QHGL WDGTS LSLALRBHEKLFMEVCRNCSAV 
LCCRMAPLQKAKVIRLIKISPEKP1TLAVGDGANDVSMIQEAHV 
GIG I MG KEGRQAARNSDYAI AR FKFLS KLLFVHGHFY Y IR I ATIi 
VQYFPYKNVCFITPQFIiYQFYCLFSQQTLYDSVYLTIiY \NI CFT 
SXiP ILI YSLIiEQHVDPHVIjQNKPTLYRDISKNRLLSIKTPI*YWT 
ILG FS HAFI PFPGS YLL IGKDTS LLGNGQMFGNWTFGTL VFT VM 
VITVTVKMALETHFWTWINHLVTWGS I IFYFVFS LPYGG ILWPF 
LGSQNMYFVF IQLLSSGSAWFAI ILM WTCLFLDI IKKVFDRKL 
HPTSTEKAQLTETNAGIKCLDSMCCFPEGBAACASVGRMLERVI 
GRCSPTHISRSWSASDPPYTNDRSILTLSTMDSSTC


6008


4554 


1089 


AGVRRAGARRGPGRALPAGATAVPPPSARRRRRCPAPEHAGPAR 
ASRPSQETMFQLPVNNLGSLRKARKTV2CKILSDIGLEYCKEHIE 
DFKQ FEPNDFYLKNTTWEDVGLWDPSLTKNQD YRTKPFCCSAC P 
FSSKFFS AYK5HFRNVHS ED FENR I LLNCPYCTFNADKICrLETH 
I KI FHAPNASAPSS SLSTFKDKNKNDGLKPKQADSVEQAVYYCK 
KCT YRD PLYE TVRKHI YRBHFQHVAAPYIAKAGE KSLNGAVPLG 
SMAREESSIHCKRCIiFMPKSYEALVQPIVIEDHERIGYQVTAMIG 
HTNVWPRSKPLMLIAPKPQDKKSMGLPPRIGSLASGNV\RSLP 
SQQMVNRLSIPKPNLNSTGVNM>1SS 

GQSMRLGLGGHAPVSIPQQSQSVKQLLPSGNGRSYGLGSEQRSQ 
APARYSLQSANASSLSSGQLKSPSLSQSQASRVLGQSSSKPAAA 
ATGPPPGNTSSTQKV7KICTICNSLFPBNVYSVHFEKEHKAEKVP 
AVANYI MKIHNFTS KCL YCNR YLPTDTLLNHMLIHGLS CPYCRS 
TFNDVEIO^AAHMRMVHlDEEMGPKTDSTIiSFDLTLQO^SHTNIH 
LLVTTYNLRDAPAESVAYHAQNNPPVPPKPQPKVQEKADIPVKS 
S PQAAVP YKKDVGKTLCPLCFS I LKGP I SDALAHHIjRERHQV I Q 
TVHPVEKKLTYKCIHCIX3VYTSNWASTITLHLVHCRGVGKTQN 

S FFEEKPEEPWLALDPKBH\ EDDSYEARXSFLTKYFT \KQPYP 
TRREIEKLAASLWV\WK\SDIASHFSNKRKKCVRDCEKYKPGVL 
I^FNMKELNKVKHEMDFDAEGLFENHDEI03SRVNASK?ADKKLN 
LGKEDDSSSDSFENLEB3SNESGSPFDPVFEVEPKISNDNPEEH 
VLKVI PEDAS ESEE KLDQKEDGS KYETIHLTEEPTKLMHNASDS 
EVDQDDWEWKDGAS PSESGPGSQQVSDFEDNTCEMKPGTWSDE 
S S QS EDARS S K PAAKKKATMQGDREQLKWKNS £ YGKVEG FWS KD 
QSQ WXNASENDERLSNPQI EWQNSTIDS EDGEQFDNMTDGVAJS P 
MHGS LAGVKLS SOQA 


6009 


4272 


1534 

i 


CHGLQHLTPFRELNLSLQG*EPH*AA*QAVRSEEKSIC*GSPSC 
HL VLGVLVP VARQSSHS AG PAQSAFR *TGTGS GTPKAAEQSGYW 
EAYTLGHQHWNMFPIQRPPLVMKGRRIMCGKCEKG*VSDSVTGG 
RAVAGEQASQRRTVFTAGGGECLGAKSVRASVFTGNQPGVMGLL 
NGKRGGCFESGYLFGFIVIGKIQSLEAKVPLPVNGQTGERASPG 
NCRIHIVDAVC*SEHH*DHFLAAAFLENSTIIS*VAPGSWQDHA 
VLQKEVQASVRCRGFESVDTAPAGFWAHSPPGLQGEPTTTSVSL 
FVLAPQDGEGV PFVEGQLVTVLGLWPQS I RHTFVHHTQL PLHP 
I * KLGALDVAFLHLLTLVCS S FNVA YG * G KNGGTTLHQLFAEVN 
AVTRGSAVQRRPSITISSIHVDTKIQQELHDVMVAGADGWQWG 
DPFWGIAGIFHLIDDPLHQIELSFQRRV*EQCOGVKPDSQPVP 
RPLRVGLLQVGPLVRGGGRRVAGRGKRCWRDLLFPWRWGLSHRT 
RDIiLRGGDRGHVVVXVLCRLGSLVGGliGTDEIiLWFGGR * L 1 1 IG 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








I * * RGRLSGEWGCGLGRGBLFQVS IGIGVS I VHIGQGDHEVLGG ' 
AGL VERGALHATGQG VEAL VQQ LLDVGPAGALGLCDGAALFQG P 
GRVGOLPAEGIiQVCITLVAQWRMHDGRELGGAEWPWQALHGAAI 
CGVGGAILLKALSQYFLKGG*RLWCARGQ*PVKKRQRRWRG*TR 
R *MGLTIHCFN* LI *GAVCCRI*VILRWCGLLBVHGVYGT * IHCL 
GS FPGRLWP+ PFI SQERPNGHCQWE FRLA VP S W KCRWSR WRVRG 
TWRYGNPLLNLL*GAWLGGAACGGQQGGPLSTWQACTGPGQAAF 
LP P FQGACR PRTQRCRTWVCP IAWRQLLAYTRD 


6010 


1 


3533 

r 


' IMPCGSSRLLRGCWTHPNEPVSDL3YFDCIESVMEWS1CVLGESM"" 
AG I S QNAKTG D LPAFG ECVGIAS KA L CGLTEAAAQ AA YL VG I FD 
PNSQAGHQGLVDPIQFARANQAIQMACQNLVDPGSSPSQVLSAA 
T I VAKH TSAL CNACRIAS S KTAN P VAKRH PVQSAKEVANSTANL 
VKT I KALDGDFS EDNRNKCRI ATAPL I EAVENLTAFAS NFE F VS 
IPAQISSEGSQAQEPI LVSAKPMLES SS YLIRTARSLAINPKDP 
PTW S VLAGHSH7 VS DS I KS L I TS I RDKAPGQR E CDYS I DG INRC 
IRD I EQASLAAVSQ5LATRDDIS VEALQEQLTS WQE I GHLIDP 
IATAARGEAAQLGHKGTQLAS YFE PL I LAAVGVAS KILDHQQQM 
TVLDQTKTLAESALQML YAAKEGGGNPKAQHTHDAI TE AAQLMK 
EAVt)DIMVTr,NF.AASEVGLVGGMVDAJAEAMSKLLDEGTPPEPKG 
TFVDYQTTWKYSlCAIAVTAQEMMTKSVTNPBEIiGGLASOMTSD 
YGKLAFQGQMAAATAEPE E I G FQ IRTRVQDLGHGCI FLVQKAG\ 
ALQVCPTDSYTKRELIECARAVTBKVSLVLSALQAGNKGTQACI 
TAATA VS G 1 1 ADLDTTI M F ATAGTLNAENS BTFADHREN I LKTA 
KALVEDTKLLVSGAAS t pdklaqaaqs SAATI TQLAEVVKLGAA 
SLGSDDPETQWLINAI KDVAKALSDLISATKGAASKPVDDPSM 
YQLKGAAFCVMVTN VTS L LfCTVKA VED EATRGTRAL EAT 1 EC IKQ 
ELTVFQSKDVPE KTSS PBES I RMTKX3 ITMATAKAVAAGN5 CRQE 
D VIATANL SR KAVSDMLTACKQAS FHPDVSDE VRTRALRFGTEC 
TLGYLDLLEHVLVI^KPTPELKQQLAAFSKRVAGAVTELIQAA 
EAMKGTEWVDPBDPTVTAETELLGAAASIEAAAKKIiEQLKPRAK 
P KQADETLDFEBQ I LEAAKS I AAATSALVKSAS AAQRELVAQG K 
VGS I PANAADDGQ WS QGL I S AARMVAAATSSLCEAANASVQGHA 
S BEKLI SS AKQVAAS TAQIiLVACKVKftDQDSBAMRRLQAAGNAV 
KRASDNLVRAAQKAAFGKADDDDVWKTKFVGGIAQI IAAQEEM 
LKKERELEEARKKLAQIRQQQYKFLPTELRBDBG


6011


446 


183S 


LLQPAMRKSPGLSbCLWAWILLLSTLTGRSYGQPSLQDELKDNT 
■ ** A^AiiwiviJijjjwijji>i*tu«jt*V7iA»r»itv l livivlUir VTob\5r'VSDH 
DMEYTIDVFFRQSWKDERLKFKGPMTVLRLNNLMASKIWTPDrF 
FHNGKKSVAHNMTMPNKLLRIT3DGTLLYTMRLTVR\AECPMAF 
GRDFPM\D\AHACPLKFGSYAYTRAEVVYEWTREPARSVVVAED 
GSRIiNQYDLLGQTVDSGIVQSSTGE^VVMTTHFHLKRKIGYFVI 
QTYLPCIMTVILSQVSFWIjNRESVPARTVFGVTTVLTMTTLSIS 
ARNSLPKVAYATAMDWFIAVCYAFVFSALIEFATVNYFTKRGYA 
WDGKSVWEKPKK\raPLIKKNNTYAPTATSYTPNIJUlGDPGlA 
T IAKSAT I E PKBVKPETKPPEPKKTFNSVS KIDRLSRIAFPLLF 
G I FNLVYWATYLNRE PQLKAPTPHQ 


6012 


351 


5013 


PAELFQSFAIWHKELYDWRLGPWNQCOPVI SKSLEKPLECI KGE " 
EGIQVREIACIQKDKDrPAEDirCEYFEPKPLLEQACLIPCQQD 
CIVS EFSAWS ECS KTCGSGLQHRTRHWAPPQFGGSGCPNLTEF 
QVCQS S P CEAEELR YS LHVG P WS TCSMPHSRQVRQARRRGKNKE 
RE KDRS KGVKD PEAREL I KKKRNRNRQNRQENKYWDI Q I G YQTR 
EVMCINKTGKAADLSFCQQEKLPMTFQSCVITKECQVSEWSEWS 
P CS KTCHDMVS PAGTRVRTRT IRQFPIGSEKECPB FEEKE PCLS 
QGIX5VVPCATYGWRTTEWTECRVDPLLSQQDKRRGNQTALCGGG 
I QTREVYCVQANENLLS QLS THKNKE AS KPMD LKLCTG P I PNTT 
QLCHIPCPTECEVSPWSAWGPCTYENCNDQQGKXGFKLRKRR1T 
NE PTGGSGVTGNCPHLLEAIPCEEPACYDWKAVRLGDCE PDNGK 
ECG PGTQVQEWCINSDGEEVDRQLCRDAIFPI PVACDAPCPKD 
CVLSTWSTWSSCSHTCSGKTTEGKQIRARSILAYAGEEGGIRCP 
NS S ALQEVRS QJEHPCTVYHWQTGPWGQC I EDTS VSS FNTTTTW 
NGEASCSVGM(7TRKVICVTlVNVGQ\raPKKCPESI^ET\n^C^ 
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








P CiCKD CI VT P YS DWTS C P S \S CKE6DS S I RKQSRHR VI IQLPAN 
GGRDCTDPLYEEKACEAPQACQSYRW\KTHKW\HRCQ\LVP\WS 
VQQDS P \GAQEGCG PGRQARAI TCRKQDGGQAGIHECLQ YAG P V 
PALTQACQ I PCQDDCQLTS WS KFS S CNGDCGAVRTRKRTLVG KS 
KKKEKCKNSHLYPLIE'rQYCPCDKYNAQPVGNWSDCILPEGKVE 
VLLGMKVQGD IKECGQG YR YQAMACYDQNGRLVETSRCNS HGY I 
EEACIIPCPSDCKLSEWSNWSRCSKSCGSGVKVRSKWLRBKPYN 
GGRP C P KLDHVNQ AQ VYE WP CHS D CN Q YLWVTE PW S I CKVT P V 
NMRENCGEGVQTRKVRCMQNTADGPSEHVEDYLCDPEEMPLGSR 
VCKLPCPEDCVISEWGPWTQCVLPCNQSS FRQRSADP IRQPADE 
GRSCPNAVEKEPCNLNKNCYHYDYNVTDWSTCQLSBKAVCGNGI 1 
KTRWLDCVRS DGKS VDL KYCEALGLEKNWQMNTS CMVECP VNCQ 
LSDWSPWSECSQTCGLTGKMIRRRTVTQPFQGDGRPCPSLME)QS 
KP CP VKPCYRWQ YGQW S PCQVQEAQCGEGTRTRNIS CWS DGSA 
DDFSKWDBEFCAJDIBLI I DGNKNM VLB E S CS QPCPGD CYLKDW 
SSWSLCQLTCV7*GEDLGFGGIQVRSRPVIIQELENQHLCPEQMIi 
ETKSC^DGQCYEYKWMASAWKGSSRTVWCQRSDGINVTGGCLVM 
SQPDADRSCNPPCSQPHSYCSETKTCHCEEGYTEVMSSNSTLEQ 
CTIiI PVWLPTMEDKRGDVKTSRAVHPTQPSSNPAGRGRTWFLQ 
PFGPDGRLKTWVYGVAAGAFVIiLI FIVSM I YLACKKPKKPQRRQ 
NNRLKPLTLAYDGDADM 


6013


1161 


710 


GAF IAfi VP V$ PVL IRYPNS LDTTS WAWRG PG VIiKVLWLTASQPC "~ 
fi I VDVE FLP VYHPS PEE SRDPTL YANNVQRVMAQADG I PATECE 
FVGSLPVIWGRLKVALEPQL/WGTGKSASBGNAVRKLCGRWGR 
ARPESNDQPGRVCQAATAL 


6014 


2857 


613 


EAVAGGMEKS RMWLP KG PDTLC FDKDEFMKED FD VDHF VSDCRK ^~ 
RVQLEELRDDLELYYKLLKTAMVELINKDYADF\VNLSTNLVSM 
DKALNQLSVPLGQLREBVLSLRSSVSEGIRAVDBRMSKQEDIRK 
KKMCVLRLIQVTRSVEKIEKILNSQSSKETSALEASSPIiLTGQI 
LERIATEFKQLQFHACQSK\GMPLLDKVRPRIAQITAMLQQSLE 
GLLLEG LQTSDVD 1 1 RHCLRTYAT I DKTRDAEALVGQVLVKP Y I 
DEVI I EQFVESHPNGLQVMYNKLIiE FVPHHCRLLREVTGGAI SS 
; EKGNTVPGYDFLVNSVWPQIVQGLEEKLPSLFNPGNPDAFHEKY 
TI SMDFVRRLEROCGSOAS VKRIjRAFTPTVYHQ vw stkwnt.d vvwriT 

rfrbiagslbaaltdvledapabspycllashrtwsslrrcwsd 
emflpllvhrlwrlhsgrfwarysvfv\n\elsiirpisnespke 
ikkplvtgskeps itqgntedqgsgpsetkpws isrtqlivyw 
adldklqeqlpelleiikpklemigfknfssisaaled3qssps 
acvps lss k 1 1 qdissds c fgflksaiievprl yrrtnkevpttas 
syvdsaiikplfqlqsghkdklkqai iqqwlegtls esthkyyet 
vsdvlnsvkkmeeslkrlkqarkttpanpvgpsggmsdddkirl 
qlaldveylgeqiqklglqasdiks fsaiaelvaaakdqatakq 

P 


6015 
> 


13 


2237 


AEGCAERRGTEPWELSMSWESGAGPGLGSQGMDLVWSAVfYGKC 
VKGKG SIjPIiSAHG I WAWLS RAEWDQVTVYLPCDDHKLQRYALN 
RITVWRS RSGNELPLAVAS TADLI RC KLIjDVTGGLGTDEIiRLIiY 
GMALVRFVNLISERKTKFAKVPLKCIAQEVNI PDWI VDLRHEIiT 
HPaCWPHINDCRRGCYF^DWLQKTYWCRQLENSLRETWELEEFR 

egieeedqeedkniwdditeqkphpqddgkstesdvkadgdsk 
gsbevdshc^kalshkelyerarellvsyeeeqftvlbkfrylp 

KAI KAWNNPSPRVECVLA ELKG VTCENREAVLDAFIjDDGFLVPT 
FEQLAAIiQIEYEENVDLNDVLVPKPFSQFWQPriLRGLHSQNFTQ 

allermlselpalgisgirptyilrwtvtilivantktgrnarrf 

S ACQ WEARRGWRIiFNCSAS LDWPRMVES CLGS PCHAS PQLLR 1 1 
F\KAMGQGIiQDE\EQEKLLRICSIYTQSGENSLVQEGSEASPIG 
KSPYTLDSLYWSVKPASSS FGSEAKAQQQEEQGS VNDVKEEEKE 
EKEVLPDQVEEEEENDDQEEEEEDEDDEDDBEEDRMEVGPFSTG 
QESFTAENARLLAQXRGALQGSAWQVSSEDVRVfDTFP\LGRMPR 
SRPRTPAELMLENYDTHVI FWTKPVL\EQRLEPSTCK\TDTLGL 
\SOGVGS\GNCSNSSSSNFRGAFLI^ARGSLH\GL\KTGliQLF 


6016


13 


2237 


ASGCAERRGTEP WE LSMS W ES GAG PGLGSQGMDLVWSAWYGKC J 
Predicted end
amino acid
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








vkgkgsl plsahgi WAWLsraewixjvtvyltcddhklqryaln 
ritvwrsrsgnelpi^vtvstadlirckllitvtgglgtdelrlly 

GMALVRFVN LISERKTKFAKVPIiKCLAQEVNI PDW I VDLRHELT 
HKKMPHINDCRRGCYFVIiDWLQKTyWCRQLENSIjRETWELEEFR 
EGIEEEDQEEDKNIWDDITEQKPEPQDDGKSTESDVKADGDSK 
GSEEVDSHCKKALSHKELYBRARELLVSYEEEQFTVLEKFRYLP 
KAI KAKNNP S PRVEC VLAELKGVTCENREAVLDAPLDDG FLVPT 
FEQLAALQI EYEENVDLNDVL VPKPFSQFWQPLLRGLHSQNFTQ 
ALLERMLS ELPALG ISGI RPTYILRWTVELIVANTKTGRNARRP 
SAGQWEARRGWRLPNCSASLDWPRMVESCLGSPCWASPQLLRII 
F \ KAMGQG LQDE \BQEKLLRI CS I YTQSGBNSLVOEGS EAS P IG 
KS PYTLDSLYWS VKPASSSFGSEAKAQQQEEQGSVNDVKEEEKE 
BKEVLPDQVEEEEBNDDQEEBEEDEDDEDDEEBDRMEVGPFSTG 
QESPTABNARLuAQKRGAIjQGSAWQVSSEDVRWDTFP\LGRMPR 
SRPRTPAELMLENYDTHVIFWTKPVL\EQRLEPSTCK\TDTLGL 
\SCX5VGS\GNCSNSSSSNFRGAPLLEARGSLH\GL\KTGLQLF 1 


6017 


2D3 


34^ 


SHQEIEQNSAMAPRKRGGRGISFIFCCFRNNDHPEITYRLRNDS 
N FALQTEMB PAX PMP P VE ELDVW FSELVDELDLTD KHREAM FALP 
AE KKWQI YCS KKKDQE ENKGATS WPB FYI DQLNS MAAR KS LLAL 
EKEEBBERSKTI ESLKTALRTKPMRFVTRFI DLDGLSCI LNFLK 
TMDYETSESRIHTSLIGCIKALMNNSQGRAHVLAHSESINVIAQ 
S LSTENIKTKVAVLE I LGAVCLVPGGHKKVLQAMLHYQKYASER 
TRFQTLINDLDXSTGRYRDEVSLKTAIMSF1NAVLSQGAGVESL 
DFRLHLRYE\ FIMLGIHPVMDKLRXHENSTLDRHLDFFEMLRNE 
DELE FAKRFE LVH IDTKS ATQMFELTRKRLTHS EAYPHFMS I LH 
HCLQMP YKRS GNTVQYWLLLDR 1 1 QQI VIQNDKGQDPDSrP LEN ■ 
FNIKNVVRMLWENEVKQWKEQABKMRKEHNELQQKLEKKEREC 
DAKTQEKEEMMQTLNKMKEKLEXETTEHKQVKQCr/ADLTAQL^ 
LSRRAVCASIPGGPSPGAPGGPFPSSVPGSLLPPPPPPPLPGGM 
LPPPPPPLPPGGPPPPPGPPPIiGAIMPPPGAPMGLALKKKSIPQ 
PTNALKS FN WS XL PENKLEG 7VWTE I DDTKVFKILDLEDLERTF 
SAYQRQQDFFVNSNSKQKEADAIDDTLSSKLKVKELSVIDGRRA 
QNCNILLSRLKLSNDEIKRAILTMDEQEDLPKDMLEQLLKFVPE 
KSDIDLLEEHKHELDRMAKADRFLFEMSRiNHYQQRLQSLYFKK 
KFAER VAE VKP KVEAIRSGSEEVFRS GALKQLLE VVLAFGNYMN 
KGQRGNAYGFKISSLNKIADTKSSIDKNITLLHYLITIVENKYP 
SVLNI«NEELRDIP07\AKVNMTELDKEISTLRSGLKAVETELEYQ 
KSQPPQPGDKFVSWSQFI TVASFSFSDVEDLLAEAKDLPTKAV 
KHFGEEAGKIQPDEFFGIFDQFLQAVSEAKQBNENMRKKKEEEE 
RRARMEAQLKEQRERER KM RKAKENS EES GEFDDL VS ALRS G E V 
FDKDLSKLKRNRKRITNQMTDSSRERPITKLNF 


6018
6019 


13 
2 


2510 
1065 


TISQSGGIRRRREAVWFEWNMDFSRLHMYSPPQCVPENTGYTY "' 
ALSSS YSSDALDFETEHKLDPVFDS PRMSRRSLRLATTACTLGD 
GEAVGADSGTSSAVSLKNRAARTTKQRRSTNKSAFSINHVSRQV 
TSSGVSYGGTVSLQDAVTRRPPVLDESWIREQTTVDHFWGLDDD 
GDLKGGN KAAIQGNGDVGAG AATGHNGFFCSKCNMLSERKD VLT 
AHPAAPG PVSRVYSRDRNQKCDDCKGKRHLDAHPGRAGTDWH I W 
ACAG YFLLQ1LRR IGAVGQAVS RTAWS ALWLAWAPGKAASG VF 
WWLGIGWYQFWLISWIjNVFLLTRCLRNICKFLVLLIPLFLLLG 
LSl^GO^\NFFSFLPVLNWASMHRTQRVDDPQDVFKPTTSRLKQ 
PLQGD SEAFP WH WMSGVEQQ VAS LSGQ CHHHG ENLR ELTTLLQK 
LQARVDQMEGGAAGPSASVRDAVGQPPRETDFMAFHQEHEVRMS 
HLEDIIX5KLREKSEAI0KELEQTKQKTISAVGEQLLPTVEHLQL 
ELDQLKSELSSWRHVKTGCETVDAVQERVDVQVREMV3CLLFSED 
QQGGSLEQIjIjQRFSSQFVSKGDJLQTMIiRDLQtiQILRNVTHHVSV 
TKQLPTSEAWSAVSEAGASG I TEAQARAI VNSALKLYSQDKTG 
M VDFALBSGGGS ILSTRCSETYETKTALMS LFGI PLW YFS QSPR 
WI QPDI YPGNCWA FKGSQG YLWRLSMM I HPAAFTLEH I PKTL 
S PTGNIS SAP KDFAVYGLRNE YQEEGQLLGQFTYDQDGES LQMF 
C3ALKRPDDTAFQIVELRIFSKWGHPEYTCLYRFRVHGEPVK 
TPNDREPPPQRPPSSRRASHIiAQEITSAASI^DQTQILGSLTTaH 
Predicted
nucleotide
corresponding
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








PVITSAIRSMPGISSQILTNAQGQVIGTLPWWNSASVAAPAPA 
QSLQVOAVTPQLLLNAQGQVIATLASSPLPPPVAVRK\PSTPES 
LLKSBVQPIKPTPTVPOPAWIASPAPAAKPSASAPIPITCSBT 
PTVSQLVS KPHTPSLDEDGINIiEEIRBFAKNFKI RRLSLGLTQT 
QVGQALTATEGPAYSQS AI CR FB KLDI TPKS AQKLKPVLEKWLN 
EAE LRNQEGQQNLMEFVGGEPS KKRKRRTS FTPQAI EALNAYFE 

KNPLPTGQEITEIAJCELtrTOREVVRWFCNRRO/rLKNTSKLNVF 
QIP 


6020
6021


4953


549


EAIQFEVSIGNYGNKFDTTCKPIiASTTQySRAVFDGNYYYYIiPW 

AHTKPVVTLTSYWEDISHRLDAVNTLIiAMAERLQTNIEALKSGI 

QGKIPANQLAELWIiKLIDEVIEDTRYTLPLTRGKANVTVLDTQI 

RKLRSRSLSQIHEAAVRMRSBATDVKSTLAE1BDWLDKLMQLTE 

EPQNSKPDIIIWMIRGEKRLAYARIPAHQVLYSTSGENASGKYC 

GKTO/riFLKYPQEKNNGPKVPVELRVNIWLGLSAVEKKFNSFAE 

GT PTVFAEM YENQALMFG KWGTSGLVGRKKFSDVTGKI KLKRE F 

FLPPKGWEWBGBW1VDPERSLLTEADAGHTEFTDEVYQNESRYP 

GGDWKPAEDTYTDANGDKAASPSELTCPPGWEWEDDAWSYDINR 

AVDEKGWBYGITIPPDHKPKSVIVAAEKMYHTHRRRRLVRKRKKD 

LTQTASSTAGAMEELQDQEGWEYASLIGWKFHNICQRSSDTFRRR 

RWRRKMAPSETHGAAAI FKLEGALGADTTEDGDEKSLEKQKHSA 

TTVTGANTPIVSCKFDRDY1YHLRCYVYQARNLLALDKDSFSDP 

YAH I C FLHRS KTTE I 1 HSTLNPTWDQTI I FDEVE I YGEPQT VLQ 

NP PKVI MELFDNDQVGKDE FLGR S I FSPWKIjNS EMD I TPKLL W 

HPVMNGDKACGDVLVTAELILRGKDGSNLPILPPQRAPNLYMVP 

QG I R PWQLTAI EILAWGLRNMKNFQMAS ITS PS L, WBCGG ERV 

ES WI KNLKKTPWFPSS VLBWKVFLPKEEI/YMP PL VI KVI DHRQ 

FGRKPVVGQCTIERLDRFRCDPYAGKEDrVPQLKASLZiSAPPCR 

DIVlEMEDTKPLLASKCIiSSKSTAI^XMASPATVHLTEKEEEIV 

DWWSKFYASSGEHEKCGQYIQKGYSKLKIYNCELENVABFEGLT 

DFSDTFKLYRGKSDENEDPSWGBFKGSFRIYPLPDDPSVPAPP 

RQFRELPDS VPQECTVRI YIVRGXELQPQDNNGLCDFYI KITLG 

KKV I E \ DRDHY I PNTLNP VFGRMYELS CYLP QEXDLKIS VYD YD 

TPTRDE KVGBTI IDLENPF\LSRFG\SHCG\lPEEYCVSGVNTW 

RDSfiR\ PTQ \liLQNVARFKGPPQP ILSEDGSRIR YGGRD YS LDE 

FEANKIIiHQHIiGAPEERLALHILRTQGLVPEHVETRTlHSTFQ P 

NIS\RYYI*RVI IWNTKDVIliDEKSITGEEMSDI YVKGWI PGNEE 

NKQ KTD VHYRS LDG EGN FNWRFVFP FD YLPAEQL C I VAKKEH F W 

SIDQTEFRIPPR\lIIQIW\DNDKFS\1jDDYLGFPRTLTCRHTI 

hflqkspggnc/rgldmipdlkt^mnplkaktaslfeqecsmkgww 

PCYAEm3ARVMAGKVEMTLEII^KEJU3ERPAGKGRDEPNMNP 
KLDLPNRPETSFljWFTNPCKTMKFIVWRRFKWVIIGLijFLLIljL 
LFVAVLLYSLPNYLSMKIVKPNV 




4953 


549 



EAIQFEVS iGNYGNKFDTTCKPLASTTa YSRAVFDGN YYYY£pTT~ 

AHTKP WTLTS YW EDI SHRLDAVNTLLAMAERLQTNIE ALKSG I 

QGKIPANQLAELWLKLIDEVIEDTRYTLPLTEGKANVTVLDTQI 

RKLRSRSLSQIHEAAVRMRSEATDVKSTLAEIEDWLDKIiMQLTE 

E PQNS M PD 1 1 IWM I RGEKRLAYARI PAHQVLYSTS GENASGKYC 

GKTQTIFLKYPQEKNNGPKVPVELRVNIWLGIiSAVEKKFNSFAE 

GTFTVFAEMYEKQAIiMFGKWGTSGLVGRHKFSDVTGKIKLKREF 

FLPPKGWEHEX3EWIVDPERSLLTEADAGHTBPTDEVYQNESRYP 

GGDWKPAEDTYTDANGDKAAS PSELTCPPGWEWEDDAWS YDINR 

AVDEKGV7EYGITI PPDHKPKSWVAAEKMYHTHRRRRLVRKRKKD 

LTQTASSTAGAKEELQDQEGWEYASLIGWKFHWKQRSSDTFRRR 

RWR RKMAP S ETHGAAAI FKLBGALGADTTEDGDEKSLEXQKHSA 

TTVTOA>rrPIV3CNFDRDYIYHLRCYVY0ARNLIiALD 

YAHICFLHRSKTTEIIHSTLNPrWDQTIIFDBVEIYGEPQTVLQ 

^ P P KVTMEL FDNDQVGKDEFLGRS I FSPWKIiNSEMD I TP KLL W 

-TPVMNGDKACGDVLVTAELILRGKDGSNLPIIjP PQRAPNLYMVP 

3G IRP WQLTA I E IliAIJGtiRNMKNFQMAS I TS PSLWECGGERV 

^SWIKNLKKTPNFPSSVLFMKVFLPKEELYMPPLVIKVIDHRQ 

rGRKPVVGQCTIERI^RFRCDPYAGKBDIVPQLKASLLSAPPCR 






WO 01/53312 



PCT/US00/34263 



SEQ
ID
NO:
nucleotide
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








DIVIEMBDTKPLIiASKCLSSMSTAI^KMASPATVHLTEKEEEfS^ - 

DWWSKFYASSGEHEKCGQYIQKGYSKLK1YNCELENVAE?EGLT 

DPSDTFKLYRGKSDENEDPSWGEFKGSPRIYPLPDDPSVPAPP 

ROFRELPDSVPQECTVRIYIVRGLELQPQDNNGLCDPYIKITLG 

KKVIE\DRDHYIPNTLNPWGRMYELSCYLPQEKDIjKISVYDYD 

irA xwoivv i 1 iujjCum fr \ ItoKr C» \6HCQ \ I PEEYCVSGVNTW 

RDSLR\PTQ\LLQNVARFKGPPQPILSEDGSRIRYGGRDYSLDE 

FEANKILHQHI/3APEERLALH I LRTQGLVP EHVETRTLHS TFQP 

NI S \ R YYLRVI I WNTKD VI LD EKS I TGEEMSDI YVKGWI PGNEB 

NKQKTDVH YRS LDGBGN FNWR FVFP FDYL PAEQLCI VAKXEHFW 

SIDQTEFRIPPR\LI IQI W\DNDKFS \LDDYLGFPRTI»TCRHTI 

HFLQKSPGGNC/RGLDMIPDLKAMNPLKAKTASLFEQKSMKGWW 

PCYAEKDGARVMAGKVEMTLEILNEKEADERPAGKGRDEPNMNP 

KLDLPNRPET3FLWFTNPCKTMKFIVWRRFKWVIIGLLFLLILL 

LFVAVLLYSLPNYLSMKI VKPKV 


6022 


4953 


543 


EAIQFEVS IGNYGNKFDTTCKPLAS TTQYS RAVFD&J VyY YLPW ' 
AHTKP WTLTS YW BD I SHRIiDAVNTLLAMAERI»QTNIEALlCSG I 
QGKI PANQLAEI>WLKI>I DE VIEDTRYTLPLTEGKANVTVLDTQI 
RKLRSRSLSQIHEAAVRMRSEATDVKSTLAEIEDWLDKLMQLTE 
EPQNS M PD 1 1 1 WMIRGEKRIA YARI PAHQVLYSTSGENASGKy C 
GKTQTI FLKYPQE KNNGP KVP VELRVN I WLGLSAVE KKFNS FAE 
GTFTVFAEMYENQALMFGKWGTSGLVGRHKFSDVTG KI KLKREF 
FLPPKGWEWEGEWIVDPERSLLTEADAGHTEFTDEVYQNESRYP 
GGDWKPAEDTYTDANGDKAASPSELTCP PGWEWEDDAWS YDINR 
AVDBKGWEYGITI PPDHKPKS WVAAEKMYHTHRRRRLVRKRKXD 
LTQTAS STAGAMEELQDQEGWEYAS L I GWK FHWKQRSSDTFRRR 
RWRRKMAPSETHGAAAI FKLEGALGADTTEDGDE KSLEKQ KHSA 
TTVF GANTP I VS CNFDRD Y I YHLR CYVYQARNL LA LDKDS FS D P 
YAHI CFLHRSKTTEIIKSTLNPTWDQTI I FDEVE I YGEPQTVLQ 
NPPKVIMELFDNDQVGKDEFLGRSI FSP WKLNSEMDITPKLLW 
H PVMNGDKACGDVL VTAELI LRGKDGSNLPILP PQRA PNL YMVP 
QQI RP WQLTAI EILAWGLRNMKNFQMAS ITS PSLWECGGER V 
ES WI KNXiKKTPNFPSSVLFMKVFLPKEELYMPPLVI KVIDHRQ 
^RKPWGQCT3ERLDRFRCDPYAGKEDIVPQLKA5LLSAPPCR 
DIVIBMEDTKPLIjASKCLSSMSTALSKMASPATVHLTEKEBEIV 
DWWSKF YASSGEHEKCGQY IQKG YS KLKI YNCELENVAEFEGI/T 
DFSDTFKLYRGKSD3NEDPSWGEFKGSFRIYPLPDDPSVPAPP 
RQFRBLPDSVPQECTVRIYIVRGIjELQPQDNNGLCDPYIKITLG 
KKVIE \ DRDH YI PNTLNPVFGRMYELS CYLPQEKDLKIS VYD YD 
wfiAVtoB xjlX U-ifiNP r \ LSRFG \S H CG \ I PEB YCVSGVNTW 
RDS LR \ PTQ \IiLQNVARFKG FPQ P ILS EDGS R IR YGGRD YSLDE 
FEANKILHQHLGAPEERLALHILRTQGLVPEHVBTRTLHSTFQP 
N IS \ RY YLR VT I WNTKD VILDEKS ITG BEMSDI YV KG WI PGNEE 
NKQKTDVHYRSLDGEGNFNWRFVFPFDYLPAEQLC IVAKKEHFW 
SIDQTEFRIPPR\LIIQIW\DNDKFS\LDDYLGFPRTLTCRHTI 
HFLQKS PGGNC/RGLDMI PDLRAMNPLKAKTASLFEQKSMKGWW 
PCYAEKDGARVMAGKVEMTTjE Tl.WFKT?nnPi?Diir wr» d n?DKiMvn 

KLDLPNRPETSFLWFTNPCKTMKFIVWRRFKWUIIGLLFLLILL 
LFVAVLLYSLPNYLSMKIVKPNV 


6023 
6024


102 


916 


SQELGMFVEL^I^NTTPDRAEQGKLTLLCDAiCTDGSFLVHHFL ~ 

SFTLKANCKVCFVALIQSFSHYSIVGQKLGVSLTMARERGQLVF 

LEGL/IVCSGR\VFQAQKBPHPLQFLREANAGNUCPLFEFVREA 

LKPVDSGEARWTYPVLLVDDLSVLLSLGMGAVAVLDFIHYCRAT 

VCTELKGNMVVLVHDSGDAEDEENDILLNGLSHQSHL ILRAEG L 

ATGFCRDVHGQLRILWRRPSQPAVHRDQSFTYQYKIQDKSVSFF 

AKGMSPAVL 




3 


3260 


FI*S FLCY PRFRCLFCLQ FAI P ASRMEQLNELEL LMEKS FWE EAE 
LPAELFQKKWASFPRTVLSTGOTNRYLVLAVNTVQNKEGNCEK 
RLVITASQSIiENKELCILRNDWCSVPVEPGDIIHLEGDCTSDTW 
IIDKDFGYLILYPDMLISGTSIASSIRCMRRAVLSETFRSSDPA 
rRQMLIGTVLHEVF^KAINNSFAPEKIjQELAFQTIQEIRHIrKEM 
SEQ
ID 
NO: 



Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



Predicted end 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 



Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



6025



~J§TT 



YRLNLSQDE I KQEVEDYLPS FCKHAGDFMH KNTSTD FPQMQ LS L 
PSDNSKDNSTCNIEWXPMDIEESIWSPRFGLKGKIDVTVGVKI 
HRGYKTKYKIMPLELKTGKESNSIEHRSQWLYTLLSQERRADP 
EAGLLLYLKTGQMYPVPANHLDKRELLKLRNQMAFSLPHRIS KS 
ATRQ KTQLAS LPQ I IEEEKTCKYCSQIGNCALYSRAVEQQMDCS 
S VP I VMLP K I E EETQHLKQTHLEYPS LWCLPILTLB SQS KDNK KN 
HQN I WLMPAS EME KSGS C XGNL IRMEHVXI VCDGQ YLHNFOC KH 
GAI P VTNLMAGDRVI VSGEERSLFALSRGYVKEINMTTVTCLLD 
RNLSVLPESTLFRLDQEEKNCDIDTPIjGNLSKIjMENTFVSKKLR 
DLI I DFREPQFIS YhSS VLPHDAKDT VAC ILKGLNKPQRQAM KK 
\HjLSKDYTLI VGMPGTGKTTTI CTLiVRILYACGFSVLIiTS yths 
AVDNILLKLA3CFKIGFLRSR\QIQKVHPAIQQFTEHEICJRSKSI 

KS \ialleblytsqlidattcmginhpifsrki fdfci vdeasq 
isqpiclgplffsrrpvlvgdhqqlpplvlnrearalgmseslf 

KRLEQNKSAWQIjTVQ YRMNSKIMSLSNKIjTYEG 10^ CGS DKVA 

e^vinlrhfkdvklblefyadysdnpwlmgvfepnnpvcflntd 
kvpapeqvbkggvsnvteaklivfjutsifvkagcspsdigiiap 
yrqqlki indliiarsigmvbvntvdkyquxrdks ivlvs fvrsn 

KDGTVGEIiLKDNRRT JJVAITRAKHKLI LLGCVPSLNCYPPLEKL 

LNHLNSEKLI idlpsreheslchilgdfqrb 



GG FPAQSDHLPPVFPLRSDLLITMS TIi YVS PH PDAFPSLRAIiIA" 
ARYGEAGEGPGWGGAHPRICLQPPPTSRTSFPPPRLPALEQGPG 
GLWVWGATAVAQLLWPAGIjGGPGGSRAAVLVOQWVSYADTELIP 
AACGATLPALQLRSSAQDPQAVIiGALGRALS PLEEWLRLHTYLA 
GBAPTLADLAAVTALLLPFR YVLDPPARR I WJNNVTRWFVTCVRQ 
PEFRAVLGEWLYSSARPLSHQPGPEAPALPKTAAQLKKEAKKR 
EKLBKFQQKQKI QQQQP PPGEKKP KP EXREKRDPGVI TYDL PTP 
PGEKKDVSGPMPDS YS PRYVEAAWYP W WEQQGFFKPE YGRPNVS 
AANPRGVFMMCI PPPNVTGSLHLGHALTNAIQDSLTRWHRMRGB 
TITiMNPGODHAG I ATQVVVEKKLWREO/3LSRHQIX3REAFXiQEVW 
KWKEEKGDRIYHQLKXLG3SLDWDRACFTMDPKLSAAVTEAFVR 
LHEEG 1 I YRSTRLVNWS CTLNSAI SD I EVDKKELTGRTLLSVPG 
YKEKVE FGVLVS FAYKVOGSDS DEBVWATTRI ETMLGDVAVAV 
HPKDTRYQHLKGKNVIHPFLSRSIiPIVFDEFVDMDFGTOAVKIT 
PAHDCJND YEVGQRHGIiEAI SIMDSRGALINVPPP FLGLPRFBAR 
KAVLVALKERGLFRGIEDNPMWPLCNRSKDWEPLLRPOWYVR 
CGEMAQAASAAVTRGDLRI LPERHQRTWHAWMDN I RE \ WCMFPG 
KLWWG \ HR\ IPAYFVTVSDPAVPPGBDPDGRYWVSGRNEAEARE 
KAAKEFGVSPDKISIiQQDEDVLDTWFSSGIiFPLSIIjGWPNOSED 
1*SVFYPGTIJI£TGHD1LFFWARMV^U^LKLTGRLPFREVYLHA 
I VRDAHGRKMSKSLGNVIDPLDVIYGI SLQGLHNQLLNSNLDPS 
EVEKAKEGQKADFPAGX PB CGTDALRFGLCA YMSQGRDINLD VN 
R I LGYRHFCNKLWNATKFALRGLG KG FVPS PTSQ PGGHESL VDR 
WIRSRLTEAVRLSNQGFQA YD FPAVTTAQ YSFWL YELCD VYI»E C 
LKPVLNGVDQVAAECARQTLYTCLDVGLRLLSPFMPFVTEELFQ 
RLPRRMPQAPPSLCVTPYPEPSECSWKDPEAEAALELAIiSITRA 
VRP\LRADYNLHPESGPTCFLEVAD\EATGALASAVSGYVOG PG 
QAQVWAVAEP WGLPAP \QGCAVALASDRCS I \ HLQLQG \ LLD P 
AREbG\KLQ\AKRVEAQ\ROAQ\RLR\ERRA\ASGNPVKVPL\E 
VQEAPEAKLQQTEAELRKVDEAIALFQXML 



2674 



514 



GP ITFLKKKAlWKDMPLRIHVIiLGtAITTL VQAVDKKVDCPRLC 
TCEIRPWFTPRSIYMEASTVDCNDLGLLTFPARLPANTQILDLQ 
TWMIAKI EYSTDFP VNLTGLD&SQNNLSS VTNINGKKMPQLLS V 
YLEENKLTELP EKCLS ELS NLQE LYINHNLLSTISPGAF I GLHN 
I»LRLHLSfSNRLWlNSKWFCALPNIiEILMIGENPIIRIKDMNF^ 
PLlNLRSLVIAGlNLTEIPDNALVGLENLESISFYDNRLriCVPH 
VALQKWNLKFLDLKKNP I NR I RRGD FSNMLHLKELG I NNM PEL 
ISIDSLAVDNLPDLRKIEATNNPRLSYIHPNAFFRLPKLESLML 
NSNALSALYHGTIESLPNLKEI S IHSNP IRCDCVIRWMNMNKTN 
IRFMEPDSLFCVDPPEFQGQNVRQVHFRDM^IEICLPLIAPE3FP 
SMLKVEAQ3 YVSFHCRATA\BPQPE I YWITPSGQKLLPNT\LTD 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








KFYVHSEGTLDINGVTPKEGGLYTCIATNLVGADLKSVMIKVDG 
SPPQDNNGSLN1KIRDIQANSVLVSWKASSKILKSSVKWTAPVK 
TENSHAAQSARIPSDVKVYNLTHLNPSTE YKIC1DI PTIYQKNR 
KKCW^TTKGLHPDQKEYEKNNTTTLMACLGGLLGIlGVICLrS 
CLSPBMNCDGGHSYVRNYIiQKPTFALGELYPPLINLWElAGKEKS 
TSLKVKATVIGLPTNMS 


6027 
6028 


5254 


4148 


waiiuvtc vncums ^ juyQaao 1 Vt Kx# v Voro^UPIjPVRYYDKTiTT'K' 

P I S FYLS £ LEEXJLAfnCPRLEBGFNVALE PLACR 0PPLS S QRPR T 
LLCHDMMGGYLDDRFIQGSWQTPYAFYHWQCIDVFVYFSHHTV 
TI PPVG WTNTAHRHGVCVLGT FI TEWNEGGRLCEAFtiAGDE RS Y 
QAVADRLVQIT\RFFRFDGWLINIENSLSLAAVGNMPPFLRYLT 
TQLHRQVPGGLVLWYDSVVQSGQUCWQDELNQHNRVFFDSCDGF 
FTNYNWRE EHLERM LGQAG E RRAD VYVG VD VFARGNVVG3RFDT 
DKVGGGFRPRASG PVP PLG PHFLMDLPFPS APQRNDS S CSSQS G 
DPVALRNRCPAPAKLCPH 




120 


3432 


NCLIiLQAXGFHGEIEDIiQQWLTPTERHLLASKPLTOLPETAKEQ 
Un^HMEVCAAPEAKEETYKSLMQWSQQMLARCPKSAETNIDQDl 
NNLKEKWESVETKLNER\KT\KLEEALNLA\MEFHNSL\QDFIN 
WLTQAEQTLNVASR PS L ILDTVLFQIDEHKVFANEVNSHREQI I 
ELDXTGTHLKYFSQKQDWLIKNLLISVQSRMEKWQRLVERGR 
SIJ}DARKRAKQFHBAWSKLMBWLEESEKSLDSELEIANDPDKIK 
TQLAQHKE FQKS LGAIOIfl VYDTTNRTGRS LKEKTSLADDNLKLD 
DMLSSLRDKWDTICGKSVERQNKLEEA\LLFSGQFTDALQALID 
WIiYRVEPQLAEDQPVHGD I DLVMNLI DNHKAFQKELG KRTSS VQ 
ALKRS ARBLI EGSRDDSS WVKVQMQELSTRWETVCALS I SKQTR 
IiEAAXjRQAEEFHSWHALLEWLAEAEQTLRFHGVLPDDEDALRT 

lidqhkefmkkleekraelnkattmgdtviaichpds ittikhw 
it: irarfeevijwakqhqqrlasalagliaxqelleallawlq 

WAETTLTDKDKEVI PQE I EEVKAL1ABHQTFMEEMTRKQPDVDK 
VTKTYKRRAADPSSLQSHIPVLDKGRAGRKRFPASSLYPSGSQT 
QIETKNPRVNIOjVSKWQQVWLIJ^RRRKUIDALDRI^E^ 
NFDFDIWRKKYWRWMNHKKSRVMDFFRRXDKDQDGKITRQEFID 
vaiudbju ir iaRIjEMSAVADIFI)RIX3DGYIDYYEFVAALHPNXDA 
YXPITDADKIEDEVTRQVAKCKCAKRFQVEQ IGDNKYRFFLGNQ 
FGDSQQLRLVRILRSTVMVRVGGGWMALDEFLVKNDPCRAKGRT 
NMBLREKFIIADGASQGMAAFRPRGRRSRPSSRGASPNRSTSVS 
SQAAQAAS PQVPATTTPKI LHPLTRNYGKP WLTN3 KMS TP CKAA 
ECSDFPVPSAEG TP IQGS KLRLPG YLSGKDFHSGEDSGIj I TTAA 
ARVRTQFADSKKTPSRPGSRAGSKAGSRASSRRGSDASDFDISE 
IQSVC3DVETVPQTHRPTPRAGSRPSTAKPSKI PTPQRKSPASK 
LDKSSKR 


6029


1 


3533 



I MPCGSSRX/LRGCWTHPNEP VSDLS YFDCI ES VMENSkVLGES M 
AGISQNAKTGDLPAFGECVGI ASKALCGLTEAAAQAAYLVG I FD 
PNSQAGHQGLVDP I QFARANQAIQMACQNLVDPGS S PS QVLS AA 
TIVAKHTSALCNACRI ASS KTANFVAKRHFVQSAKEVANSTANL 
VKTI KALDGDFSEDNRNKCR I ATAPLIEAVENIiTAFASNPEFVS 
I PAQISSEGSQAQEPILVSAKPMLES SSYLIRTARSLAINPKDP 
PTWSVLAGHSHTVSDS I KSLITS IRD FCAPGQRECD YS IDGINRC 
IRDIEQASIJUVVSQSLATRDDISVEALQEQLTSVVQEIGHLIDP 
I ATAARG EAAQLGHKGTQLAS YFEPL I LAAVGVAS KI LDHQQQM 
TVLDO/TKTLAESALQMLYAAKEGGGN P KAQHTHDAI TEAAQLMK 
EAVDDIMVTLNEAAS EVGLVGGMVDAIAEAMSKLDEGTPPEPKG 
TPVDYQTTWKYS KAIAVTAQEMMTKS VTNPEELGGliASQMTSD 
YGHLAFQGQMAAATAEPEE IGFQIRTRVQDLGHGCI FLVQKAG\ 
ALQVCPTDS YTKRELI ECARAVT E KVS L VLSALQAGNKGTQAC I 
TAATAVS GI I ADLDTTI MFATAGTLNABNSET FADHRENT LKTA 
iCALVEDT KLL VSGAAS T PDK LAQAAQS S AAT I TQLAEWKLGAA 
3LGSDDPETQWLINAIKDVAKALSDLISATKGAASKPVDDPSM 
^QLKGAAKVMVTNVTSX.LKTVKAVEDEATRGTRALEATIECI KQ 
2LTVFQS KDVPEKTS S PEES I RMTKGITMATAKAVAAGNS CRQE 
>VIATANLSRiCAVSDMLTACKQAS FHPDVSDEVRTRALRFGTEC 
SEQ
ID
NO:


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








TLGYLDLLEH VLVI LQKPTPELKQQLAAFS KR VAQAVTE L I QAA 
EAMKGTEWVD PEDPTVI AETE LLGAAAS I EAAAKKLEQL KPRAK 
PXQADETLDFEEQI LEAAKS IAAATS ALVKS ASAAQREL VAQGK 
VGS I PANAADDGQWSQGLISAARMVAAATSSLCEAANASVQGHA 
SEE KLI S S AKQ VAAS TAQ LLVAC KV KADQDS EAMRRI^AAGNAV 
KRASDNLVRAAQKAAFGKADDDDWVKTKFVGGIAQI IAAQEEM 
LKKERELE EARKKLAQ I RQQQ YK PLPTELREDEG 


6030 


3 


1777 


FPGRG3PALQLEVLICLGLMGLERAIjNVLAPiPYRNIVNLLTEN 
APWNSI^WTVTSYVFIiKFLQGGGTGSTGFVSNLRTFLWIRVQQF 
TSRRVBLLI FSHLHBLSLRWHLGRRTGBVLRZADRGTSSVTGIjI» 
SYLVFNVIPTLADIIIGIIYFSMFFNAWFGLIVFLCWSLYLTLT 

XV V 1 aHK i R.C KAnTWi 1 yjlLNAl K/UCA VUaijijNrKTVKYYNAESYE 

VER YREAI I KYQGLEWKSSASLVLLNQTQNLVIGLGLIAGSLLC 
AYFVTSQKLQVGDYVLFGT YI I QLYMPLNWFGTYYRM I QTNFID 
MENMFDLLKK\3TEVKDLPGAGPFRFQKGRIEFENVHFSYADGR 
BTLQDVSFTVMPGQTLALVGPSGAGKSTIIiRLLFRFYDISSGCI 
RIDGODISQVTQALFRFSHWELCPKDTVLFNDTIADNIRYGRVT 
AGNDE VEAAAQ AAG IHDAI MAF PEG YRTQ VGERG LKLS GGE fCQR 
VAIARTILKAPGIILLDEATSALDTSNERAIQASLAKVCyVNRTT 
IWAHRI^TVVNADQILVIKDGCIVERGiUiEALI^RGGVYADMW 
QIiQQGQEETSEDTKPQTMER 


6031 


160 


1694 


lrmsenldksnvneagksksndSeegledavegadealqkaiks 

DSSSPQRVQRPHSSPPRFVTVEEUiETARGVTNMALAHEIVVNG 
us v *>A"» fiiiir must ulUUC V Vxil\AlrWL7Uli& VQXiS EDPPAYDHA 
I KLVGE I KETLLS FLLPGHTRLRNQ I TEVLDLDL I KQEAENGAL 
DI SKLAEFI IGMMGTLCAPARDBB VJCKJaKDI KEXVPLFRE1FSV 
LDLMKVDMANFAISSIRPHLMQQSVEYERKKFQBI LERQPNSLD 
FVTQWLEEASEDLMTQKYKHALPVGGMAAGSGDMPRIiSPVAVQN 
YAYLKLLKWDHLQRPFPETVLMDQSRFHELQLQ\REQI#TILGAV 
LLVTFS MAAPG I SS QAD FAEKLKM I VK ILLTDMHLPSFHLKDVIj 
TT I G EKVCLEVS S CLS L CG S S PFTTD KETVLKGQ I Q AVAS PDDP 
IRRJMESRILTFLETYIiASGHQKPLPTVPGGLSPVQREIiBBVAI 
KFARL VNYNKMVFC P YYDAI IiS KXLVRS 


6032 


39 


2415 


AARLCRAQPTKSAWMIRDLSKMYPQTRHPAPHQPAQPFKFTISE 

SCDRI KEEFQFLQAQi^SLKLECEKLASEKTEMQRHYVMYYEMS 

YGLN I EMHKQAE I VKRLNAI CAQVI P FLS QEHQQQWQAVERAK 

QVTMAELNAIIGQQQLQAQHLSHGHGLPVPLTPHPSGLQPPAIP 

PIGSSAGLLALSSALGGQSHLPIKDEKKHHDNDHQRDRDSIKSS 

SVSPSASFRGAEKHRNSADYSSESIOCQKTEBKE1AARYDSDGEK 

SDDNLWD VSNEDPSSPRGS PAH3PRBNGLDKTRLLKKDAP IS P 

ASIASSS3TPSSKSKELSLNEKSTTPVSKSNTPTPRTDAPTPGS 

NSTPGLRPVPGKPPGVDPLASSLRTPMAVPCPYPTPFGIVPHAG 

MNGELTSPGAAYAGLHMISPOMSAAAAAAAAMlAYRRQPVVnTrn 

PHHHMRVPAI P PNLTG I PGGKP AYSFHVSADGQMQPVP FPPDA1* 

IGPG I PRHARQINTLNHGEWCAVTISm>TPJlVYTGGKGCVKVW 

DISH PGNKS P VSQLDCLNRDN Y I RS CRLLPDGRTLI VGGEAS TL 

S I WDLAAPTPR IKAELTS SAPACYALAJ S PDSXVCFSCCSDGNI 

AVWDLHNQT LVRQFQGHTDGAS C I D I SNDGTKLWTGGLDNTVRS 

W\DLREGRQIiQQHD/FFTSPVFSLGYCP\TEEWIAVGMENSN\V 

EVLHVTK3 , DKYQLHI*HESCVI*SLKFAHOGKWF\V^ 

W\RT P YG \ ASIF \QS KESSS \ VLS CDI \ S VDDKYI VTGS \GDK\ 

RATVYEVIY 


6033 


39 


2415


AARLCRAQPTKSAWMIRDLSKMYPQTRHPAPHQPAQPFKFTISB 
SCDRI KEEFQFLQAQYHSLKLECEKLASEKTEMQRHYVMYYEMS 
YGLNIE^KOAEIVKRLNAICAQVIPFIiSQEHQQQVVQAVERAK 
QVTMAELNAI IGQQQLQAQHLSHGHGLPVPLTPHPSGLQPPAIP 
P I GSSAGLLALSSALGGQSHLPI KDEKKHHDNDHQRDRDS IKSS 
S VS PSAS FRGAEKHRNSADYS S E SKKQKTEEJC3I AARYDSLX3E K 
SDDNLWDVSNEDPSS PRGSPAHSPRENGLDKTRLLKKDAPIS P 
ASIASSSSTPSSKSKEI^LNEKSTTPVSKSNTPTPRTDAPTPGS 
NSTPGLRPVPGKPPGVDPIASSLRTPMAVPCPYPTPFGIVPHAQ 
SEQ
ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








KNGBLTS PGAAYAGLHNIS PQMS AAAAAAAAAAAYGRS P WG FD 
PHHI IMR VP AI PPNLTGIPGG K P A YS FHVS AD3QMQ P VP FP P DAL 
IG PGI PRHARQ I NTLNHGE WCAVTISNPTRHVYTGGKGCVKVW 
DISHPGNKSPVSQLDCLNRDNYIRSCRLLPDGRTLIVGGEASTL 
S I WDLAAPTPR IXAELTSS APACYALAIS PDSKVCFSCCS DGN I 
AVWDLHNQTLVRQFQGHTDGAS CIDISNDGTfG^WTGGLDNTVRS 
W\DLRBGRQLQQHD/PPTSPVFSLGYCP\TEEWLAVGMENSN\V 
EVIiHVTKPDKYOLHLHESCVLSIiKFAHCGKWF\VSTGKDOTjI^A 

W\RTPYG\ASIP\QSK5SSS\VLSCDI\SVDDKYIVTGS\GDK\ 
RATVYEVIY 


6034 


2683


714 


ESGRRRRL1CRRRSPCPGTAGGPGETNPGPGACPRGPREJ2AAAAM 
EIAPQEAPPVPGADGDIEEAPAEAGSPSPASPPADGRLKAAAKR 
VTFPSDEDI VSGAVE P KD P WRHAQNVTVDE VT G AYKQACQ KLNC 
RQ I PKLLRQLQEFTD LGHRLDCLDLKGE KLD YKTCBALEEVFKR 
LQFKWDLE QTNL DEDGAS ALFDM I B Y YESATHLNI S FNKH I GT 
RGWQAAAHMMRKTSCLGYL\DARNTPLLI>HSAPFVARar stpcc 
LAVLHLENASLS GRP LMLLATALKMNKNLREL YIi\ADNKliNGIjQ 
DSAQLGNLLKFNCS LQI LDLRNNHVLDSGLAYICBGLKEQRKGL 
VTL\ VLWNNQLTHTGMAFLGMTLPHTQSLETLNIjGHNP I GNEGV 
RHL.KNGLISNRSVLRU3LASTKLTCEGAVAVAEFIAESPRLLRL 
DLRENEIKTGGLMALSIJUiKVNHSLLRIJaiJJREPKKEAVKSFIE 
TOKALI^IQNGCKRNLVIJ^EREEKEQPPQLSASMPETTATEP 
QP DDEPAAGVQNGAPS PAPSPDSDS DSDS DGEEEEE EEGERDE T 
PSGAIDTRDTGSSEPQPPPEPPRSGPPLPNGLKPEFALALPPEP 
PPGPEVKGGSCGLEHELS CSKNEKELEELLLEASQESGQETL 


6035 


19 


404 


SVTYIX3IILHKNTGALPADPVQLISQTPTPSTKQX31jLSFLGMVG 
YFYLWIPGFAILTKPLCKIiTKENIiADAIDPKSFSHSSFRSLKTA 
LENASTIiALPDSSQPF\SLHTABVQGCVVEZLTQGLGPLPV 


6036 


1745 




LPDVEKIXSRRRGRKMDSVEKGAATSVSNPRGRPSRGRPPKLQRir" 
SRGGOGRGVEKPPHIiAALIIiARGGSKGXPI»KNI KHLAGVPLIGW 
VLRAALDSG AFQ8 VW VSTDHDEI ENVAKQFGAQVHRRS S E VSKD 
S STSLDAI I EFLN YHNEVD I VGN I Q ATS PCLHPTDLQKVAEMIR 
EBG YDS VFSWRRiJQ FRWSEI QKG VREVTEPUfLNPAKRPRRQD 
WDGEL YENGS FYFAXRHLI KMGYLQGGKMAYYEMRAEHSVDIDV 
DIDWPIAEQRVLRYGYFGKEKLKEIKLLVCNIDGCLTNGHIYVS 
GDQKEIISYDVKDAIG1SLLKKSGIEVRLISERACSKQTLSSLK 
LDCKMEVSVSDKLAVVDEWRKEWGLCWKEVAYLGNEVSDEECLK 
RVGLS G APADACSTAQKAVG YI CKCNGGRGA\ I REFAEHI C\ LL 
MEKGLINFMPKNRNLAVNIGEKK 


6037 


2936 


1919 


WTSWWMSSVLTILLFSLQGNKMLNYSAPSAGGYLLPRKPVGTPA 
GGGPPRRHSVTLPSSKFRQNQLliSSLKGEPAPALSSRDSRFRDR 
SFSEGGERLLPTQKQPGGGQVNSSRYKT\ELCRPFEENGACKYG 
DKCQ FAHG IH BLRS LTRHPKYKTELCRTPHT IGFC PYGPRCHFI 
HNAEERRALAGARDLSADRPRLQHSFSFAGFPSAAATAAATGLL 
DS PTS I TP P P I LS ADDLLGS PTLPDGTNNPF \AFS SQELAS LFA 
PSMGLPGGGS PTTFLFRPM5ES PHMFDS PPSPQDSLSDQEGYLS 
SSS SSHSGSDSPTLDNSRRLP I FSRLS ISDD 


6038 


1450 


426 


SSALQEFGTRNHTFGVPLPHRRKQI I S CNICQLRFltfSDSQAAAH 
YKGT KHAFCKL KALE AMKNKQKSVTAKDSAKTTFTS ITTNT INTS 
50KTDGTAGTPAISTTTTVEI RKSSVMTTEITS KVBKBPTTATG 
NSSCPSTETEEBKAKRIi\YCSLCKVAVNSASQLEAHNSGTKHK 
TMLEARNGSGTI KAFPRAGVKGKGPVNKGNTGLQNKT FHCE I CD 
VHVNSETQLKQH1SSRRHKDRAAGKPPKPKYSPYNKLQKTAHPL 
GVKLVFSKEPSKPLAPRILPNPLAAAAAAAAVAVSS PFSLRTAP 
AATLFQTSALPPALLR PA PGP IRTAHTP VL PAP Y 


6039 


4073 


1000 


LDE YEARLTLAKLDDFEEDNEDDDENRVNQEEKAAKI TELINKL 
NFUJEAEKDLATVNSNPFDDPDAAELNPFGDPDSEEPITETASP 
RKTEDSFY*TOSYNPFKEVQ^QYliNPFDEPEAFVTIKDSPPQST 
KXKN 1 RP VDMS K YLYADSSKTEEEELDE SNP FYEPKS TPPPNNL 
WP VQE LETERR VKRKAPAP P V1»S PKTGVLNENTVS AGKDLS TS 








PKPS PIPSPVLGRKPNA3QSIiLVWCKEVTKNYRGVKITNPTTStf — 1 
RNGLSFCAILHHFRPDLIDYKSLNPQDIKENKKKAYDGF71SIGI 
SKLLE PS DMVLLAI PDKLTVM'L'YL YQ IRAHFS GQ ELNVVQ I EEW 
S S KSTYKVGNYETDTNS S VDQ EKF YAELSDLKREP BLQQP I SGA 
VDFLS QDDS VPVNDSG VGESESEHQTPDDHLS PSTAS P YCRRTK 
SDTE PQKS QQS SGRTS GSDDPG I CSNTDS TQAQVLLGKKRLIiKA 
ETLELSDLYVSDKKKDMSPPFICEETDEQKLQTLDIGSNLEKEK 
LENS RSLE CRSDPES F I KKTS LS PTS KLG YS YS RDLDLAKK KHA 
SLRQTESDPDADRTTLNHADHSSKIVQHRLLSRQEELKERARVL 
IiEQARRDAALKAGNKHNTNTAAPFCNRQLSDQQDEERRRQLRBR 
AKUii 1 AKARSG G KMS EL PS YGERAAE KLKERSKASGDENDNIE I 
DTNE E I PEG FWGGGDELTNLENDLDTPEQNS KL VDLKLKKLLE 
VQPQVANS PSSAAQKAVTES S EQDM KSGTEDLRTERLQ KTTERF 
RKPWFSKDSTVRKTQLQSFSQYIENRPEMKRQRSIQEDTKKGN 
EEKAAITETQRKPSEDEVLNKG FKDS \ SQYWGELAALENEQKQ 
IDTRAALVEKRLRYXh©TGRNTEEEEAMMQEWFMLVKKKNAIjIR 
RKNQLSLLEKEHDLERRYELLKRELRAMLAIEDWQKTEAQKRRE 

QLLIJ)EjVALVHKRDALVRDIJ»QEXOAEEEDEHLERTLBQJ«CG 
KMAKKEEKCVLQ 


6040 


475 


1052 


PTAI^lTAPSC^PVQFRQPSVSGLSQITKSLYISNGVAANNia^ 
| I^SNQITMVINVSVEVVNTLYEDIQYMQVPVADSPNSRLCDFFD 
PIADHXHSVEMKQGR\TLI^CAAGVSRSAALCLAYI^KYHAMSL 
LDAKTWTKSCRP 1 1 R PNSGFWEQLIHYE FQLFGKNTVHM VS S PV 
GMIPDIYEKEVRLMIPL 


6041
6042 


2 


3886



TEKDEKTAHNLENVLIHFWERLSE I CVAKISEPEADVESVLGVS | 
NLLG VLQKPKGSL KS SKKKNG KVR FADE I LESN KEMEKCVS S EG 
EKIECWELTTEPSLTHNSSGLLSPLRKKPLEDLVCKLADISINY 
VNER KS EQHLRFLS TLLDS FS S SRVFKMLLGDEKQS IVQAKPLE 
I AKLVQKNPAVQFLYQKL I G WLNEDQRKDFGFLVD I LYS ALRCC 
DNDMERiCKVLDDLTKVDLKWNSLLKl ISKACPSSDKHALVTP WL 
KGDILGEKLVNLADCLCNEDLESRVSSESHFSERWTLI.SLVLSQ 
HVKOTYIiIGDVYVERriVRLHETLFKTiCKI.SEAESSDSSVSFIC 
DVAYNYFSSAKGCLLMPSSBDLLLTLFQLCAQSKEKTHLPDFLI r 
CKLKNTWLSGVNLLVHQTDS S YXESTFLHLSALWLKNQVQAS S L 
D INSLQVLLSAVDDLLNTLLES EDS YLMGVYIGSVMPNDSBWEK 
MRQSLPMQWLHRPLLBGRLSLNYECFKTDFKEQDIKTLPSHLCT 
SALLSKMVLIALRKETVLENNELEKIIAELLYSLQWCEELDNPP 
IFLIGFCEILQKMNITYDNLRVLGNMSGLLQLLFNRSREHGTLW 
S LI I AKLI LS RS IS SDEVKPHYKRKESFFPLTEGNLHTI QS LCP 
FLSKEEKKEFSAQCIPALLGWTKKDLCSTNGGFGHLAIFNSCLQ 
TKS I DDGELLHGI LK 1 1 1 S WKKEHEDI FLFS CNLSE AS PEVLGV 
NIB I IRFLSLFLKYCSSPIiAESEWDFIMCSMLAWLETTSENQAL 
YS I PLVQLFACVS CDLACDLS AFFDSTTLDTIGNLPVKLISEWK 
EFFSQGIHSLLLPILVTVTGENKDVSETSFQNAMLKPMCETLTY 
ISKEQLIiSHKLPARLVADQKTNLPEYLQTLLNTLAPLLLFRARP 

winv *™^ Ui »>-*^riiiji'yxi^Dr^KSYGDEEEEPAIiSPPAALMS I 
LLS IQBDLLENVLG C I P VGQI VTI KPLSED FCYVLGYLLTWKLI I 
LTFFKAASSQLRALYSMYLRKTKSLNKLLYHLFRLMPENPTYAB 
TAVEVPNKDPICrFPTBELQLS IRETTMLP YHIPHLACS VYHMTL 
KDLPAMVRL WWNS S E KRVFN I VDRFTS KYVS S VLS FQB 1 3 S VQT 

STQLFNGMTVKARATTREVMATYTIEDIVIELIIQLPSNYPLGS 
I IVES GKRVG VAVQQWRNWMLQL S T YLTHQNG S IMEGLAL WKNN 
tfDKR FEG VED CMIC FS VIHG FNYS LPKKACRTCKKKFHS A\ CLY 
KWFTSShTKSTCSLCRETFF 




1306 


253


4AEIAPASPSDIKASVSNGDTTLLCSRJ?QSCGMNEVRQVSLTYpH 

3SPAPSHSLPLQPRSGGSLCPSRAW/PDPHQLFDDTSSAQSRGY 

JAQRAJ^GGIsSYPAASPTPHAAFLADPVSNMAMAYGSSLAAQGKE 

jVDKNIDRFIP ITKLKYYFAVDTMYVGRiCLGLLFFP YLHQDWEV 

?YQQDTPVAPRFDVNAPDLYIPAMAFITYVLVAGLALGTQDRFS 

>DLLGLQASSAIAWLTLEVLAILLSLYLVTVNTDLTTIDLVAFL 

JYKYVGMIGGVLMGLLFGKIGYYLVLGWCCVAI FVFMIRTLRLK 


6043 


403 


599 


I iiADAAAEG VP VRGARNQLRMYLTMAVAAAQ PMLMYWL.T FH LVR | 

LCLFFFFPC^TPVLPIiPSLISAL./CLSHl.SVSSMFCPCQPPirP'cl 
PLPPLQNKTAKGSLSTEQSBRG | 


6044 


793 


412 


KLEMWNFTLISKVKISREVTMIASKFGIGQQVRHSLLGYLGW^ 
DXDPVYSLSEPSPDELAVKDELRAAPWYHWMBDDNGLPVHTYL 

aeaqlsselqdewpXeqpsmdblaqtirkqlqaprlrn J 


6045 


155


2299 


SPLPQVAAMNYLRRRLSDSNFMANIjPNGYMTDLORPQ^PPPPPG 

ahspgatpgpgtataerssgvapaaspaapspgssggggffssl 
skavkqttaaaaatfseqvgggsggagrggaasrvllvidepht 
dwakyfkgkkihgeidi kveqaefs dlnl vahang g fs vdme vh 
rngvkwrs lkpdfvli rqhafsmarngdyrslviglqyagi ps 
vnslhs vynfcdkp wvfaqmvrlhkklgtee fplidqt fypnhk 
BMI>SS \ttypvwkmghgtlwgwgkvkvdnqhdfqdiasvvalt 

KTY ATAE PFIDAKYD VR VQKI GQN YKA YMRTS VSGNWKTNTQSA 
MLEQIAMSDRYKLWVDTCSEIFGGLDICAVEALHGKDGRDHIIE 
WGSSMPL1GDHODEDKQLIVELWNKMA0ALPRQRQRDASPGR 
GSHGQTPS PG ALPLGRQTS QQ PAGP PAQQRPP PQGGPPQ PGPG P 
QRQGPPLQQRPPPQGQQHLSGLGPPAGSPLPORLPSPTSAPQQP 
AS QAAP PTQG QGRQS RPVAGGPGAP PAARPPAS PS PQRQAGPPQ 
ATRQTS VSGPAP PKASGAPPGGQQRQGP PQKPPGPAGPTRQASQ 
AGPVPRTGPP TTQQPRPSGPGPAGRPKPQLAQKPSQDVP PPATA 
AAGGPPHPQLHKSQSLTNAPNIjPEPAPPRPSIiSQDEVKAETIRS 
LRKSFASLFSD 


6046 


212 


1075


egltgpcervpfllgrgpphgatraghrravrwagpeslpplpT} 

SLIMDSPRAGTHQGPLDAETEVGADRCTSTAyQEQRPQVEQVGK 
QAPLS PGLPAMGGPGPGPCBDPAGAGGAGAGaSEPLVTVTVQCA 

ftvalrarrgadlsslrallgqalphqXaqi^qlsylapgedgh 

WVPIPE2E3LQRAWQDAAACPRGLQLQCRGAGGRPVLYQWAQH 
SYSA0<3PEDLGFRQGI>TVDVLCEVDQAWLEGHCDGRIGIFPKCF 
WPAGPRMSGAPGRLPRSQQGDQP 


6047 


49 


1405 


PVLVTSIiRMREADTLRPPQUlEVSADI I STVE FNHTGELLATGD J 

KGGRWIFQREPESKNAPHSQGEYDVYSTFQSHEPEPDYLKSLE 

IEBKINKIKWLPQQNAAHSLLSTNDKTIKLWKITERDKRPEGYN 

LKDBEGKLKDLSTVTSLQVP VLKPMDLMVEVS PRRI FANGHTYH 

INSISVNSDCETYMSADDIjRINLWHLAITDRSFTP\NIVDIKPA 

NMEDLTEVITA3EFHPHHCNLFVYSSSKGSLRLCDMRAAALCDK 

HSKI»FEEPEDPSNRSFFSEIIS\SVSDVKFSHSDRYMLTR\DYL 

TVKVWDlt\NMEARP I ETYQVHDYLRS KLCSL YBNDCI FDKFE CA 

WNGSDSVIMTGA\YNNFFRMFDRNTKRDVTL\EASRESSKPRAV 

LKPRRVCVGGKRRRDDI SVDSLDFTKJCIliHTAWHPAENI 1AXAA 

TNNLYI FQDKVNSDMH 


6048 


1 


3194 


GIRTP KFCDSPTS DLEMRNGRGRGKRMRPNSNTPVWETATASDS"H 
KGTSNSSKXRAGANSKGRRGSQNSSEHRPPASSTS EDVKAS PSS 
ANKRKNKPLSDMELNS SSEDSKGS KRVRTNSMGS ATG PL PGTKV 
EPTVLDRNCPS P VL ID CPH PNCNKKYKHINGLKYHQAHAHTDDD 
S KPEADGDS EYGEEPILHADLGSCNG \ ASVSQK \ GSLS PARSAT 
PKVRLVEPHSPSPSSKFSTKGLCKKKIiSGBGDTDLGAliSNDGSD 
DGPS VMDETSNDAFDS LERKCMEKEKCKKP S SLKPEK I PSKS LrfC 
SARPI/APLAIPPQQIYTFQTATFTAASPGSSSGLTATVAQAMP 
NSPQLKPIQPKPTVMGEPFTVNPALTPAKDKKKKDKKKKESSKE I 
IjESPLTPGKVCRAEEGKSPFRESSGNGMKMEGLLNGSSDPHQSR 
lasikaeadkiysftdnapspsiggqqrt.pmttptndt tdtinm 

tq>igaeassvktnspaysdisdagedgegkvdsvkskdaeqlvk 
egakktbfppqpqskdspyyqgfesyyspsyaqsspgalnpssq 
agvesqalktkrdeepesiegkvkndiceekkpelsssscxjpsv 

IQQRPKMYMQSLYYNQYAYVPPYGYSDQSYHTHLLSTNTAYRQQ 

yeeqqkrqsleqqqrgvdkkaemglkereaalkeewkqkpsipp 
tltkapsltdlvksgpgkakepgadpaksvi i pklddssklpgq 
apeglkvklsdashlskeas eaktgaecgrqaemdp ilmyrqea 

EPRMWTYVYPAKYSDIKSEDERWKEERDRKLKEERSRSKDSVPK 








EDGKESTSSDCKLPTSEBSRLQSKEPRPSVHVPVSSPLTQHOSY" 
I P YMHG YS YSQS YD PNHPS YRS MPAVMMQNYPGS YLPS S YS FS P 

YGSKVSGGEDADKARASPSVTCKSSSBSXALDILQQHASHYKSK 
S PTIS DKTSQERDRGGOGVVGGGGSCSS VGGASGGERS VDR PRT 
SPSQRLMSTHHHHHHLGYSLLPAQYNLPYAAGLS STiv mcrwi 
STPSLYPPPRR 


6049 


215 


1089 


AMTGVFDRR VPS IRSGDFOAPPQTSAAMHHPSQESPTLiPESSAT 
DS3YYSPTGGAPHGYCSPTSA3YG\KALNPYQYQYHGVNGSAGS 

ypakayadysyassyhqyggaynrvpsatnqpekevtepevrmv 
ngkpkkvrkprtiysspqlaalqrrfqktqylalperaelaasl 

GLTOTQVKIWFQNKRSKIKKIMKNGEMPPEHSPSSSDPMACNSP 

qspavwepqgssrslshhphahpptsnqspassylensaswyts 

AASSINSHLPPpGSLQHPLALASGTLY 


6050 


566 


j" i^ie 


«wtu»i\ivwnjoDouoflM Aci\xiXlijv3^KMUPPLGEPG\GSIjGWVi* 
PNTAMKKKVXiLMGKSGSGKTSMRS 1 I PANYIARDTRRLGATILD 
RIHSLQINSSLSTYSLVDSVGNTKTFDVEHSHVRFLGNLVLNLW 
DCGGQDTFMENYFTS 0RDNI FRNVEVL I YVFDVESRELEKDMH Y 
YQS CLEAI LQNSPDAKI FCLVHKMDLVQEDQRDL1PKBREEDLR 
RLSRP LECSCFRTS I WDETLYKAWS 5 IVYQLIPNVQQLEMNIjRN 
FAEIIEADEVLLPERATFLVISHYQCKEQRDAHRFEKISNIIKQ 
FKLSCSKIAASFQSMEVRNSNFAAFIDIFTSNTYVMWMSDPSI 
PSAATLINIRNARKHFEKLERVDGPKQCLLMR 


6051 
6052 


566


1718 


* ^->_-f-vi liijidujoi^i x Ctr^EiEi±jtj~ KMDfFJWsiiPG\GSLGWVL 
PNTAMKKKVXiLMGKSGSGKTSMRS 1 1 PANYIARDTRRLGATILD 
R I HSLQ INS S LSTYSLVDS VGJnTCTFDVBHSHVRFXX3NLVLNLW 
DCGGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESRELEKDMHY 
YQS CLEAI LQNS P DAKI FCLVHKMDLVQEDQRDLI FKEREED LR 

RLSRPLECSCFRTSIWDETLYKAWSSIVYQLIPNVQQLEMNLRN 
FAEIIEADEVLLFERATFLVTSHYQCKEQRDAHRFEKISNI I KQ 
FKLSCSKLAASFQSMEVRNSNFAAFIDIFTSNTYWVVMSDPSI 
PSAATIjINIRNARKHFEKLERVDGPKQCLIiMR 




566 


1718 



«w«mm*» wwvuaagi/gcxvi 1 EtlULIMljVJi'XroUi'k'iitiiJli'Ca^GSLGWVL 

PNTAMKKKVLLMGXSGSGKTSMRSIIFANyiARDTRRLGATILD - 
RIHSLQINSSLSTYSLVDSVGNTKTFDVEHSHVRFLGNLVLNLW 
DCGGQDTFMENYFTSQRDN I FRNVEVLI YVPDVESRELEKDMHY 
YQS CLEAI LQNS PDAKI FCLVHKMDLVQ EDQRDL I FKEREEDLR 
RLSRPLECSCFRTS IWDBTLYKAWSS I VYQLIPNVQQLEMNLRN 
FAB I IEADEVLLFERATFLVXSHYQCKEQRDAHRFEKISNI I KQ 
FKLSCSKLAASFQSMEVRNSNFAAFIDI PTSNTYVMWMSDPSI 
PSAATLINIRNARKHFEKLERVDGPKQCLLMR 


6053 


201 


1704 


KGTEMNKSRWQSRRRHGRRSHQ^NPWFRLRDSEDRSDSRAAQPA 
HDSGHGDDBSPSTSSGTAGTSSVPELPGFYFDPEKKRYPRLLPG 
HNNCNPLTKESIRQKEMESKRLRLLQEEDRRKKIARMGFNASSM 
LRKSQLGFLNVTNYCHLAHELRLSCMER KKVnTR «?mtid cnr » C n 
R FNL I LADTNSDRL FTVND VTVGGS KYG I INLQSLKTPTLKVFM 
HENLYFTNR KV\NS VCWAS LNHLDSHI LLCLMGLAETPGCATLL 
PASLFVNSHPAGIDRPG\MLCSFRIPGAWSCAWSLNIQANNCFS 
TGLSRRVI&TNVVTGHRQS FGTNSDVLAQQ FAIWAPLLFNG CR S 
G 21 FAI D LRCGNQG KG WKATRL FHDSAVTS VRILQD EQ YLMAS D 
MAGKIKLMDLRTTKCVRQYEGHVNEYAYLPLHVHEEEGrLVAVG 
QDCYTRIWSLHDARLLRTIPSPYPASKADIPSWSSRLGGSRG 
APGLLMAVGQDLYCYSYS 


6054 


i 


1054 



PPIARI^EFGTSRRHMAAPSGVHLLVRRGSHRIFSSPIJilHl^H- 

KQSSSQQRRNFFFRRQRDISHSIVLPAAVSSAHPVPKHIKKPDY 

^HTOIVPDWGDSIBVKNEDQIQGLHQACQLARHVLLLAGKSLKV 

dr-itteeidalvhrei ishnayps plgyggfpksvctsvnnvlch 
sipdsrplqdgdiinidvtvyyngyhgdtsetflvgnvdecgkk 
livevarrcrdeaiaacragapfsvigntishithqngfqvcphf 
/ghgigsyfhghpeiwhhandsdlpmeegmaftiepiitegspb 

^KVLEDAWTVVSLD/TSKVSAQFEHrVLITSRGAQILTKLPHEA 



437 



WO 01/53312 



PCT/US00/34263



6055




421 


1 23^4 


' PPYPLI^FUWWLVGQ^bRTETDISQSAGPPeGTIiQCSJUaHDp-" 
GCANCSRFCRDCSPPACQCHTHVPPGmiJIGVQPPELSRTLAXI 
SSREPPRKKKKSQTETGKERERTS PLTQGGKRFELQHGLAG I CM 
TLLITGDS IVSABAVWDHVTMANRELAPKAGDVIKVLDASNKDW 
WWGQ I DDE BGW FP AS FVRLPTVNHBDEVEEGPS DVQKGRLDPNSD 
CL CLGRPLQNRDQMRAUVI NB IMSTERHY I KHLKDI CEG YLKQC 
R KRRDMFS DEQLKVI PGN I EDI YRPQMGFVRDLBKQYNNDDPHL 
SErGPCPLEHQDGFWiySEYCNNHLDACMELSKLMKDSRYQHFF 
EACRLLQQMI D I A\ I DGFLLTPVQKI CKYPLQLAELLKYTAQDH 
SDYRWAAALAVMRNVTQQINERK^LENIDKIAQWQASVLDWE 
GEDILDRSSELIYTGEMAWIYQP\YGRNQQRVFFLFDHQMVLCK 
KDLIRRDILYYKGRID>IDKYEVVDIEDGRDDDFNVSMKNAFKLH 
NKETEEIHLFFAKKLEEKIRWLRAFREERKMVQEDEKIGFEISE 
NQKRQAAMTVRKVPKQKGVNSAKSVPPSYPPPQDPLNHGQYLVP 
\ DG I AQSQVFE FTEPKRSQS PFWQNFS RLTPFKK 


6056


43 


3358



SGGRGPVRVRSEQLSPSAEQVSQlSQISltiGWlPLSSliPPPPSRA 
LAPTRAPDTALTIMEVAEVESPLNPSCKIMTFRPSMEEFREFNK 
YLA YMES KGAHRAGLAKVI P PKEWKPRQCYDD IDNLL I PAP I QQ 
MVTGQSGLFTQYNIQKKAMTVKEFRQLANSGKYCTPRYLDYEDL 
ERKYWKNLTFVAPIYGADINGSIYDEGVDEWNIARLNTVLDVVE 
EECGISIEGVNTPYLYFGW^KTTFAWHTEDMDLYSINYLHFGEP 
K5WYAI?PEHGiO^ElUAO^FF^SSSOX3CDAFLPJDCMTLISPSV 
LK KYGI PFDKITQEAGE FM I TFP YG YHAGFNHG FNCAES TNPAT 
VRW I D YGKVAXLCTCRKDMVKISMD I FVRKFQPDRYQLWKQGKD 
IYTIDHTKPTPASTPEVKAWLQRRRiCVRKASRSFQCARSTSKRP 
FCADBEEEVSDEVDGAEVPNPDSVTDDLKVSEKSEAAVKLRWTEA 
SSEEESSASRMQVEQNLSDHIKLSGNSCIiSTSVTEDIKTEDDKA 
YAYRSVPSISSEADDSIPLSTGYEKPEKSDPSELSWPKSPESCS 
SVAESNGVLTEGEESDVESHGNGLEPGEIPAVPSGERNS FKVPS 
IAEGENKTSKSWRHPLSRPPARSPMTLVKQQAPSDEELPEVLSI 
EEEVEETESWAKPLIHLWQTKPPNPAAEQBYNATVARMKPHCAI 
CTLLMPYHKPDSSNEEOTARWETKLDEVVTSEGKTKPLI PEMCF 
I YSEHNI EYSPPNAPLEEDGTS LL I SCAKCCVRVHAS CYG I PSH 
E ICDGWLC^CKRNAWTAECC^CNLRGGAIiKQTKNNXWAHVMCA 
VAVPBVRFTNVP ERTQIDVGRI PLQRIiKLKCI FCRHR VKRVS G A 
CI QCSYGRCPASFHVTCAHAAGVL\MEPDDWPYWNITCFRHKV 
NPNVKS KACEKVI S VGCHVITKHRNTRYYSCRVMAVXSQTFYE V 
MFDDGS FSRDTFPEDIVSRDCLKLGPPAEGEWQVKWPDGKLYG 
AKYFGSNIAHMYQVEFEDGSQIAMKREDIYTLDBELPKRVKARF 
VSAGRCHLGTCQVNSLSSPHVSQAQQETYIjGFW INS KKSQCNI F 
LSGTY 


6057 


1 


853


FVARLKEQEGEGGIjGPRKEKGRARGRERRRKMQIfTRCCFVFLVQ 

gslylvicgqddgppgsedperddhegqprprvprkrghispks 
rpmanstl1x3liappgeawgiix3qppnrpnhspppsakvkki fg 
wgdfysniktvalnllvtgk1vdhgngtfsvhfqhkatgqgnis 
islvppskavefhqeqqifieakaskifnc\rmewekve\rgrr 

TS L FTHDPAKI CSRDHAQSSATWSCSQP FKWCVY IAF YSTD YR 

lvqkvcpdynyhsdtpyypsg 


6058
6059 


i 




HPLPSASLGLPSVSLGVSLCVRSAIiLEAWPMIiPKRRRARVGSP 
SGDAASSTPPSTRFPGVAIYLVEPRMGRSRRAFLTGLARSKGFR 
VLDACSS EATH WMEETS AE8AVS WQERRMAAAP PGCTPPALLD 
I SWLTES I^AGQPVP VECJIHRLEVAGPSKGPLS PAWMPAYACQR 
eivui hhxv it, USEAL E ILAEAAGFEGSBGRLLTFCRAASVIiRAL 
PSPVTTLSQLQGLPHFGEHSSRVVQELLEHGVCEEVERVRRSE/ 
RLFTQIFGVGVKTADRWYREGLRTLDDLREQPQKLTQQQKAGBP 
SREAG PWASLNCTLD P SAS TP 




2 


3650 

1 


QQDFSSLADLTDHRAHRCPGDGDDDPQLSWVASSPSSKDVASPT 
3MIGIX5CDLGLGEEEGGTGLPYPCYJFaDKSFIJU*SYLKRHEQlH 
SDKLP FKCTYCSRLFKHKRSRDRH i klhtgdkkyhche ceaafs 
RSDHLKIHLKT^SSKPFKCTVCKRGFSSTSSIXJSHMQAHKKNK 

emijucsekeakkddfmcdycbdtfsqteelekhvltrhpqlsek 








ADLQCIHCPEVPVDENTt*LAHIHQAHAMQkriKCPMCPB\QFSSV 
\EGVYCHLDSHRQPDSSNHSVSPDPVLGSVASMSSATPDSSASV 
BRGSTPDSTLKPLRGOKKMRDDGQGWTKWySCPYCSKRDFNSL 
AVLEIHLKTIHADKPQQSHTCQICLDSMPTLYNIiNEHVRKIiHKN 
HAYP VMQFGNI S AFHCNYCPEMFAD I NSLQEHI RVSHCGPNANP 
SDGNNAFFCNQCSMGFLTESSIiTEHIQ\Q\AHCSVGSAKLBSPV 
VQ PTQS FMEVYS CPY CTNS P I FGS I LKLTKH I KBNHKNI P LAHS 
KKSKAEQS PVSSDVEVSS PiCRQRLS ASANS ISNGEYPCNQCDLK 
FSNFESFQTHLKLHLELLLRKQACPQCKEDFDSQESLLQHLTVH 
YMTTSTHYVCESCDKQFSSVDD\LQKJ!\LLDMPHPLCCTHCT\L 
CQEVFDS\KVSI \QVHLAVKHSNE KKMYRCTACNWDFRKEADLQ 
VHVKHSHLGNPAXAHKCI FOGETFSTEVBIiQCHITTHSKKYNCK 
FCSKAFHAI ILLEKHLREKHCVFDAATBNGTANGVPPMATKKAfi 
PADLOGMLLKNPE APNSffK A^FDDVTI AS P PMYrtfDT fYSi A VTMP 
VLLQNHRLRDHN IRPGEDDGSRKKAE FIKGSHKCNVCS RTFFSE 
NGLREHLQTHRG PAKHYMCPICGERFPSLLTLTEHKVTHS KSLD 
TGTCRICKMPLQSEEEFIEHCQMHPDLRNSLTGPRCWCMQTVT 
STLELKIHGTFHMQKLAGSSAASS PNGQGLQKLYKCALCLKEFR 
SKQDLVKLDVNGLPYGIiCAGCMARS ANGQVGGLAPP EP ADR PCA 
GLRCPECSVKFESAEDLESHMQVDHRDLTPETSGPRKGTQTSPV 
PRKKTYQCI KCQMTFENB RE I QIHVANHMIBEG INHECKIiCNQM 
FDSPAKLLCHLIEHSFBGMGGTFKCPVCFTVFVQANKLQQHIFA 
VHGQEDKI YDCSQ CPQKF FFQTBLQNHTMSQHAQ 


6060 


2145 


202 



SYE I VGKNKLEVNHSQI.KALCKCSLPSRLLPLGENLPLLDRGFR ~ 
KEPRS RGSRERDNMLHLHHS CLCFRS WLPAMLAVLLSLAPS ASS 
DISASRPNlIJJiMADDLGIGDIGCYGNNTl^TPWIDRLAEDGVK 
LTQKISAASLCTPSRAAFLTGRYFVRSGMVSSIGYRVLQWTGAS 
GGLPTNETTFAKI LEEKG YATGLIGKWHLGLNCESASDHOiHPL 

HH^FDHFYGMPFSlJ4GDC!ARWPT»SEKR\/MI J pnTCT.N1?T.W , nVT A~ \T 
ALTLVAGKLTHL I PVSWMP VI WSALSAVLLLASS YFVGALI VHA 
DCFLMRKHTITEQPMCFQRTTPLILQEVASFLKRNKHGPFLLPV 
S FLHVH I PLITMEN FLGKS LHGLYGDNVKEMDWMVGRILDTLD V 
EGLSNSTLIYFTSDHGGSLENQI^GNTQYGGWNGIYKGGKGMGGW 
EGGlRVPGIFRWPGVLPAGRVIGEPTSLMDVFPTVVRIiAGSEVP 
QDRVIDGQDLLPLliLGTAQHSDHEFLMHYGERFLHAARWHQRDR 
GTMWKVHFVTPVFQ?EGAGACYGRKVCPCFGEKVVHHDPPLLFD 
LSRDPSETHILTPASEPVFYQVMER\VQQAVWBHQRTLSPVPLQ 
LDRLGNI WRPWLQPCCGPFPbCWCLREDDPQ 


6061 


110 


1330 


MNIHMKRlCTTKKTNTB , RMPMr.MT.rY2MP A17P\>vrPT.T.POT?r>ri cpvt ~ 

VHNYPDM EAVPL LLNNVKG EPPEDSLS VDHFQTQTEPVDLS INK 
ARTSPTAVSSSPVSMTASASSPSSTSTSSSSSSRLASSPTVITS 
VSSASSSSTVLTPGPLVASASGVGGO^FLHIIHPVPPSSPMNLQ 
SNKLS HVHR I PVVVQS VpVVYTAVRSPGNVNNTI VVPLLEDGRG 
HGKAQMDPRGLSPRQSKSDSDDDDLPNVTLDSVNBTGSTALS IA 
RAVQBVHPS PVSRVRGNRMNNQKFPCS ISPFSIES TRRQRTVLN 
PPDSRKTAYSTDCDF\EGI*QQKLYTKS ss pgrvhrrthtgekp y 
ZCTWEG CTWKFARSDELTRH YRKHTGVKP FKCADCDRS FSRSDH 
LALHRRRHMLV 




71 


1079 


ETMAKNGPENCEDCH IL5JAEAFKS KKICKSLKICGLVFGI LALT "~ 
LIVLFWGSKHFWPEVPKKAYDMEHTFYSNGEKKKIYMEIDPVTR 
TE I FRSGNGTDETLEVHDFKNG YTGI YFVGLQKCFI KTQI KVI P 
EFSEPEEEIDENEEITTTFFEQSVIWVPAEKPIENRDFLKNSKI 
LEICDNVTMYW\INPTL\ISGTFAKQLHHNFAFTILVSELQDFE 
EEGEDLH FP ANT2KKG I EQNEQWWPQ VKVE KTRHAROASEEBL P 
INDYTENG IEFDPMLDERG YCCI YCRRGNRYCRRVCE PLLGYYP 
YPYCYQGGRVICRVIMPCNWWVARMLGRV 


6063 


71 


1079 


ETMAKNGPEKCEDCHIIiNAEAFKSKKICKSLKICGLVFGILALT 
LIVLFWGSKHFWPEVPKJCAYDMEHTFYSNGEKKKIYMEIDPVTR 
TEI FRSGNGTDETLBVHDFKNGYTG I YFVGLQKCFI KTQI KVI P 
EFSEPEEEIDENEEITTTFFEQSVI WVPABKP I ENRDFLKNSKI 
LBICDNVTMYW\ INPTlA I SGTFAKQLHHNFAF I ILVS ELQDFE 



EEGEDLHFPANEKKGIEQNEQWWPQVKVEKTRHARQASEBELP 
INDYTENGIBFDPMLDERGyCCIYC^GNRYCRRVCEPIjLGyYP 
YP YCYQGGR VTCR VI MPCNWWVARMLGR V 



311 



6065 



1153 



641 



hlpqsi.prptbhsppyslekM^dlvavWdValsdgvhkiepehg 

TTS GKR WYVDGKE E I RKEWMP KL VGKBTF YVGAAKTKATINID 
AISGFAYEYTLEINGKSIJCKYMEDRSKTTNTWVLHMDGENFRIV 
LEKDAMD VWCNG KKLETAGB FVDDGTETHFS IGTH\ACYIKAV\ 
SSG\KRKEGI IHTL IVDNRE I PE IAS 



6066 



68 



3470 



MSVK VAKVAW VRGLGAS YRRGAS S FPV PP PGAQGVABLLRDATG 
AEEEAPWAATERRMPGQCSVLLPPGQGSQWGMGRGLLNYPRW 
BLYAAARRVLGYDLLELSLHGPQETLDRTVHCQPAIFVASLAAV 
EKLHHLQPSVIBNCVAAAGFSVGEFAALVFAGAMEFAEG 



"60TT 



858 



321 



VKENMPATRKPMRYGHTEGHTEVCFDDSGSFIVTCGSDGDVRIW 
BDLDDDDPKFINVGEKAYSCALKSGKLVTAVSNNTIQVHTFPEG 
VPDG I IiTRFTTNANHWFNGDGTKIAAGSSD \ FLVKI VDVMDSS 
QQKTFRGHDAPVfcSLSFDPKDI FLAS AS CDGS VR VWQ I SDQTCA 
ISWPLLQKCNDVTNAKS ICRIAWQPJCSGKLLAIPVEKSVKLYRR 
ES WSHQFDLSDNFISQTLNI VTWS PCGQ YLAAGS ING L 1 1 VWNV 
ETKDCM ERVKHEKGYAI CGLAWHPTCG RIS YTDAEGNLGLLE1NV 
CDPSGXTSSSKVSSRVEKDYNDLFDGDDMSNAGDFIiNDNAVEIP 
SFS KG I INDD EDDEDLMMASGRPRQRS HI LBDDENS VD I S MIiKT 
GSSLLKEEEEDGQEGSlHNIiPLVTSQRPFYDGPMPTPRQKPFQS 
GSTPLHLTHRPMVWNS IGIIRCYNDEQDNAIDVEFHDTSIHHAT 
HLSNTI»NYTIADLSHEAILLACBSTDELASKLHCLHFSSWDSSK 
EWI IDLPQNEDIEAICLGQGWAAAATSALLLRLFTIGGVQKEVF 
SLAGPWSMAGHGEQLFiraiRGTGFBGDQCLGVQLLELGKKKK 
QILHGDPLPLTRKSYIiAWIGFSAEGTPCYVDSEGIVRMLNRGLG 
NTWTPICNTREHCKGKSDHYWVVGIHENPQQLRCIPCKGSRFPP 

TIjPRPAVAILS fklp ycqi atekgqmeeqfwrsvi fhnhld yla 

KNGYEYEE5TKNQATKEQQELLMKMLALSCKLEREFRCVEIjADL 

mtqnavnlaikyasrsrkliijvqklseiavekaaeltatqveee 
eebedfrkklnagysntatewsqprfrwqveenaedsgeaddee 

KPE IHKPGQNS FSKSTNSSDyS AKSGAVT FSSQGRVNP FKVSAS 
S KEP AMSMNSARSTNIIiDNMGKSS KKS TALSRTTNNEKS P 1 1 KP 
LIPKPKPKQASAASYFQKRNSQTNKTEEVKEENIiKNVLSETPAI 
CPPQNTENQRPKTGFQMWIiEENRSNIIiSDNPDFSDEADIIKEGM 
IRFRVLSTEERKVWANKAKGETASEGTEAKKRKRVVDESDETEN 
QEEKAKENLNLSKKQKPLDFSTNQKLSAFAFKQE 



LPWQIOCVLI^RGKMAVTGWIjESLRTAQKTAIJjQDGRRK 
PDGKEMAEEYDEKTSELLVRXWRVKSALGAMGQWQLEVGDPAPL 
GAGNLGPELI KESNANPI FMRKDTKMS FQWRIRNLPYPKDVYSV 
SVDQKERCI I VRTTNKKYYKKPS IPDLDRHQLPLDDALLSFA\T 
PTAP 



13 



1730 



OS KMADItANE EKPAIAP P VFVFQKDKGQKS PAEQKNLS DSGEE p 
RGEAEAPHHGTGHPES AGEHALEPPAPAGAS ASTP P PP APEAQL 
PPFPRELAGRSAGGSSPEGGEDSDREDGNYCPPVKRERTSSLTQ 
FPPSQSEERSSGFRLKPPTLIHGQAPSAGLPSQKPKEQQRSVLR 
PAVLQAPQPKALSQTVPS SGTNG VS LPADCTGAVPAAS PDTAAW 
RSPSEAADEVCAIjEEKEPQKNESSNASEBEACEKKDPATQQAFV 
FGQNLRDRVKLINESVDEADMENAGHPSADTPTATNYFLQYISS 
SLENSTNSADASSNKFVTGQNMSERVl^PPKliNEVSSnANRENA 
AAESGSESSSQEATPEKESLAESAAAYTKATARKCLLEKVEVIT 

geeaesnvlqmqcklfvfdktsqswvergrgllrlndmastddg 
tlqsrlsdagprgslr\lilntklwaqmqidkasek\siritam 

DNEDQGVKVFI, I SAS S KDTGQVYAALHHR I LALRS RVEQEQEAK 
M PAPE PGAAP SNE EDDSDDDDVLAP SGATAAGAGDEGDGQTTGS 



27 



FrRPGQAGSSSAMAAQRIiGKRVLSKLQSPSRARGPGGSPGGLQK 
RHARVTVKYDRRELQRRLDVBKWIDGRIiEBLYRGMEADMPDEIN 
IDELLBLESEEERSRKIQGLLKSCGKPVEDFIQELLAKLQGLHR 








Q \ PGLRQP5 PS P \DGQPS APFQG P CARTAS P LTLLAL FPG P PER 
RPALLCVLSCI 


6070 


478 


858 


IRVTVDGEFIiHYIPPLQFLDSPBW/RFTETHRGRHFKQVTLTAE 
TDCRYVSWRRKKLYLLPAQHRY ISRLFSVLIGSDIADKLYALND 
RVYIGKRYHYDIRIjPNFYQMSTPEIRRSPLTQHFQNSRRYW 


6071 


2 


1654 


HEARTKGNMALARP\VRLFSLVTRLLLAPRRGI*TVRSPDEPLPV 
VRI PVALQRQLEQRQSRRRNLPRPVI>VRPG PLLVSARR PELNQP 
ARLTLGRWERAPLASQGWKSRRARRDHFSIBRAQQEAPAVRKLS 
o JWJ a r Ai^LAiAW K.P K V IirtALQE \AAPEWQ \ PTTVQSST I PSLLR 
GRHWCAAETCSGKTLSYLLPIjIjQRIjIiG\HPSLDSLPIPAPRGL 
VLV P S R2LAQ Q VRAVAQ P LGRSLG LL VR DLKGGHG MRR I RLQI*S 
RQPSADVLVATPGALWKALKSRLISLEQLSFLVLDEADTLLD2S 
FLELVDYTLEKSHIAEGPADLEDPFNPKAQIiVLVGATFPEGVGQ 
LLNKVAS PDAVTTITSSKLHCIMPHVKQTFLRLKGADKVAELVH 

TT.lfllDno RPOTfTJCHTtfT iTtWToeoittntHryivTT niMii/TAut t*t 
luAruu/iuuK 1 V>r ou 1 V bv c UNoo a I VNWLAj Y XljUOrlK-LQnXiRli 

QGQMPALMRVGX FQSFQKSSRDILLCTDI ASRGLDSTGVELWN 

lUc Jf r i. LfWU I XluUUajC VGRVGSISVPGTv ISFVTHPWDVSIjVQKI 

ELAARRRRSLPGLASSVKEPLPQAT 


""6072 


1 


742 


KMERTEMMPTINSQLEPKSKPFPLVSSSRWLVKRGkL!rAYVEDT 
VLFSRRTSKOQVYFFLFNDVLIITKKKSEESYNVNDYSLRDQLL 
VESCDNEELNSSPGKNSSTMIiYSRQSSASHLPTLTVLSNHANEK 
VEMLLGAETQSERARW ITALGHSSGKPPADRTSLTQVE I VRS FT 
AKQPDELSLQVADWIil \ YQRVSDGWYEGER\LRDGERGWFPME 
CAKE I TCQAT IDKNVERMGRIiLGLBTNV 


6073 


620 


oFn 

DDU 


PCRRGLAK PXjSRRPG/ S I It VHCAVG VSRS ATLVLA YLMIj YHHLT 
LVEAIKKVKDHRGI IPNKGFLRQLLALDRRLRQGLEA 


6074 


168


1110 



pgarcmatelqcpdsmpcrinqqvnsastpspeqlrpgdlildha 
ggnrasrakvilltgyahsslpaeijdsgacggsslnsegnsgsg 
dsssydapag^sfledcelsrqigaqlkllpmndqirelqtiir 
dktasrgdpmfsadrlirlweeglnql pykecmvttptgykye 

GVKFEKGNCGVSIMRSGEAMEQGIiRDCCRSIRIGKIIiIQSDEET 
QRAKVYYAKFPPDIYRRKVIJjMYPILQTGXNTVIEAVKVTjI ehg 
VQPSVI ILLSLFS7PH0AKS I IQEFPEITI LTTEVHPVAPTHFG 
QKYFGTD 


" "6075 " 


320 


1091 


PPTC£PQEVi^\YGYVPIIX3NKT^ 

KLGPE I ERAECT I RMNDAPTTC YS ADVGNXTTYR VVAHS S VFR V 
LRRPQEFVNRTPETVFIFWGPPSKMQKPQGSLVRVIQRAGLVFP 
NMEAYAVSPGRMRQFDDLFRGETGKDREKSHSWLSTGWFTMVIA 
VELCDHVHVYGMVPPNYCSQRPRLQRMPYHYYEP KG PD ECVTYI 
QNEKSRKGNHHRFITEKRVFSS WAQLYG ITFSHPSWT 


6076 


1721 


107 


HPS PTEAPR VQHLTMDCTWR I L FLVAAATGTHAQVQL VQSGAE V " 
KKPGASVKVSCKVSGYTLTELSMHMVRQAPGKGLEWMGAFDPED 
GET I YAQKFQGRVTMTE DTSTDTAYMELSSLRS BDTAVYYCATD 

n\3U in£ lsj. nuyu j n v i voo/ii* 1 J\Atrl/vr r X laLrUKnrlvUNoP V V 

LACLITGYHPTSV\TVTWYMGTQSQA\QRTFPEIQRRDSYYMTS 
SQLSTPLQQNRQGEYKCWQHTASKSKKEI FRWPESP KAQASS V 
PTAQPQAEGSIiAKATTAPATTRNTGRGGEBKKKEKEKEEQEERE 
TBCTPECPSHTQPLGVYLLTPAVQDLMLRDKATFTCPWGSDLKD 
AHLTWEVAGKVPTGGVE EGLLE RHS NGSQ S QHS RLTL PRS L WNA 
GTSVTCTWJHPSLPPQRWIALREPAAQAPVKLSLNliLASSDPPE 
A\ASWLLCEVSGFSPPNILLMWLEDH3EVNTSGFAPARPLPKP\ * 
RSTTFWA\WSVLRVPAPPSPQPATYTCVVSHEDSRTLLNA5RSL 
BVSYVTDHGPMK 


6077 


3687 


1268 


LLPDMNLQPI FWIGLISS VCCVFAQTDENRCLKANAKS CGBCIQ ' " 
AGPNCGWCTNS TFLQEGMPTS ARCDDLEALKKKG CP PDD I ENPR 
GSKDI KKNKNVTNRSKGTAEKLKPEDITQIQPQQLVLRLRSGEP 
QTFTLKFKRAEDYPIDLYYLM\DLSYSMKDDLENVKSLGTDLMN 
E^irSDFRrGFGSFVEKTVMPYISrrPAKLRNPCTSEQNCTS 
P F3 YKNVLS LTNKGBVFNELVGKQRISGNIiD S P EGGFDAIMQ VA 
VOGS LI GWRMVTRLLVFS TDAGFHFAGDGKLGG I VLPNDGQCHL 


6078






ENNMYTMSHYYDYPSIAHLVQKLSENNIQTIFAVTEEFQPVYKE" 
LKNLIPKSAV0TLSAMSSNVIQLIIDAYNSLSSEV1LENGKLSB 
GVTISYQ5y\CKNGVNGTGENGRKCSNISIGDEV0FElSITSNK 
CPKKDSDSFKIRPLGFTEEVEVI LQYI CECECOSEG I PPqp vrv 
EGNGTFE CGACRCNEGRVGRHC ECSTDE VNS EDIGC FTARKENQ 
FQKSASNHGRVPSAGQCVCRKRDNTNBIYSGKFCECDNFNCJDRS 
NGL»I CGGNGVCKCRVCECNPNYTG SACD CSLDTSTCEASNGQ I C 
NGRGICECGVCKCTDPKFQGQTCEMCQTCLGVCAEHKECVQCRA 
FNXG EKKDTCTQECS YFNI TKVE SRDKL PQPVQPDP VS HCKEKD 
VDDCWFYFTYSVNGNNEVMVHVVENPBCPTGPDI IP IVAGWAG 
I VL I GLALLLI WKLLM I IHDRRE FAKFE KEKMKAKWDTGENP I Y 
KSAVTTWNPKYBGK 


6079


1426 


180


BTEDVMEL^EEDLTCPICCSliFDDPRVLPCSHNFCKKCLEGILE 

GSVRNSLWRPVPFKCPTPRWTPQVKTPT TOT ^ttuvct ffntimw 

NKI XIS PKMP VCKGH \ LGQPLNI F\ CL\ TDMQLOL/CG I C\ ATR 
GBHTKHVPCS I E D A YAQERDAFES LFQS FETWRRGDALS RliDTL 
BTS KRKSLQLIiTKDSDKVKEFFEKIiQHTLDQKKNEILS DFETMK 
LA VMQA YD P E INKLNT I LQEQRMA PNIAEAFKDVSEPI VFLQQM 
QEFREKIKVI KETPLPPSNLPASPLMKNFDTSQWEDI KLVDVDK 
LSLPQDTGTFIS KI P WS FYKLFLL I LLLGLVI VFGPTMFLEWSI* 
FDDLATVJ KG CLSNFSSYLTKTADFIEQSVFYWEQVTDGFFIFNE 
RFKNFTLWLNNVAEFVCKYKLL 




1586


141 


ATARDU3CARRIDRWMESTPSRGLNRVHLQCRNLQEFLGGr>SP 
GVLDRLYGHPATCLAVFRELPSLAKNWVMRMLFLEQPLPQAAVA 
LWVKKE FS KAQEE STGLLSGLR I WHTQLL PGGLQGLI LNP I FRQ 

nuunuyjuu AAn &UU A oUi^ri^iaiAKIWPSIjDKYJUsERWEVVL 

HFMVGSPSAAVSQDLAQLLSQAGLMKSTEPGEPPCITSAGFQFL 
LLDTPAQLWYFMLQYI/>TAQSRGMDLVEILSFLFQLSFSTLGiCD 
YS VEGMS DSLLNFLQKLREFGL VFQRKRKSRR YYPT/RALAINL 
SSGVSGAGGTVHQPGFIV\VETNYRLYAYTESELQIALIALFSE 
MLYPFP \KMW\ARVTR\ESVQ0AIASQITAQQI IHFLRTRAHP 
VT^KQTPVLPPTITDQIR.LWELERDRLRFTEGVLYNQFLSQVDF 
ELL \ LAHAPKLGVLVFE /NTPAKRLM WTPAG.HSDVKRFWKRQK 
HSS 


6080 


1 


1199 


IETI DHVGEPAMAAQAAGVSRQRAATC^IjGSNQNALK^ 

PTELC PS PQF I VGGATRTD I CQGG LGD C WLLAAI AS LTLNE EIjL 
YRWPRDQDFQENYAQ I FH FQPLCP PS ? \ FWQYGEWVE WIDDR 
LPTKNGQLLFLHSEC^NEFWSAIiLEKAYAKLNGCYEALAGGSTV 
EGFBDFTGGI SEFYDLKKPPANL YQI IRKALCAGS LU3 CS ID VY 
SAAEAEAITSQKLVTCSHAYSVU-GVEEWFQGHPEKLrRLRNPWG 
EVEKSGAWSDnAPEWKmiDPRRKEEUDKKVEDaEFWMSLSDFVR 
QFSRLElCNLSPDSLSSEEVHKWNLVliFNGHVJTRnqTanrpnTsjv 
PGSS 


6081


3 


865 


EMLPIoLLPLPLLWA/ GALAQDARFRLEMPESVTVQEGLCI fvhc 
SVFYLEYGWKDSTPAYGHWFREGVSVDQETPVATNNSTQKVQKE 
TQGRFHLLGDPSRNNCSLS IRDARRRDNGS YFFWVARGRTKFS Y 
KYSPLSVYVTALTHRPDILIPEFLKSGHPSNLTCSVPWVCEQGT 
PPIFSWMSAAPTSLGPRTLHSSVLTIIPRPQDHGTNLICQVTFP 
GAGVTTERTIQLSVSWKSGTVEEVVVLAVGWAVKIIXLCLCIjI 
I LSFHKKKAVRAVE VEENVYAVMG 


6082
6083


283 
1865 


1288 

309


EARSPGPTQTRTAPGLAAPGLAQPAALRLLLSRPPSAAMDGDGD 
PESVGQPEEASPEEQPEEASAEEERPEDQQEEEAAAAA\Y\LDB 
LPEPLI^/LRVLAALPRHE\LVQACR\L.VCLRWKELVDGAPLWIi 
LKCMEGLVPEGGVEEBRDHWQQFYFLSJCRRRNLIJOTPCGEEDL 
EGWaDVEHGGDGWRVEELPGDSGVEFTHDESVKKYFASSFEWCR 
KAQ VXDLQAEGYWE ELLDTTQPAI WKDW YSGRS DAGCLYELTV 
KLLSEHBNVLABFSSGQVAVPQDSDGGGWMEISHTFTDYGPGVR 
PVRFEHG GQDS VYVf KG W FGAR VTNS S VWVE P 

fCQWCAERRGLGMSIJUJEIjLADLEEAAEEEEGGSYGEEEEEPAIE ' 
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
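
The notation defined in the legend above can be illustrated with a minimal sketch. The function name annotate_segment and the choice to skip whitespace are assumptions made for illustration only; they are not part of the disclosure. The sketch separates the one-letter residue codes from the three positional symbols (*, / and \) and records where each symbol falls relative to the residues read so far.

```python
# One-letter codes exactly as listed in the table legend.
RESIDUES = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine",
    "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine",
    "S": "Serine", "T": "Threonine", "V": "Valine",
    "W": "Tryptophan", "Y": "Tyrosine", "X": "Unknown",
}

# Non-residue symbols defined by the same legend.
MARKERS = {
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def annotate_segment(segment):
    """Split a table segment into residues and positional markers.

    Returns (residues, markers), where markers is a list of
    (number_of_residues_before_marker, description) tuples.
    Whitespace is skipped (an assumption); any character not in
    the legend raises ValueError rather than being guessed at.
    """
    residues = []
    markers = []
    for ch in segment:
        if ch.isspace():
            continue
        if ch in RESIDUES:
            residues.append(ch)
        elif ch in MARKERS:
            markers.append((len(residues), MARKERS[ch]))
        else:
            raise ValueError(f"character not in legend: {ch!r}")
    return "".join(residues), markers


if __name__ == "__main__":
    # Hypothetical segment, not taken from the table.
    print(annotate_segment("MKT/LL\\A*"))
```

Rejecting characters outside the legend, instead of silently dropping them, keeps transcription damage visible and avoids producing a sequence that looks valid but is not.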


6084






uvuiSETgUH^GDSVKTIAKLWTSkMFAElMMKlBBYISKQAKA" 
SEVMGPVEAAPEYRVIVDAIWLTVEIENELNIIHXFIRDKYSKR 
FPBLES LVPNALDYI RTVKELGNS LDKCKNNENLQQ I LTNATI M 
WSWASTTQGOQLSEEBLERLEEACDMALELNASKHRIYEYVE 
3 RMSFIAPNLS 1 1 1 GASTAAKI MGVAGGLTNLS KMPACNIMLLG 
AQRKTLSGPSSTSVLPHTGYIYHSDIVQSLPPIPPPPSVAP\DL 
RRKAARLVAAKCTLAARVDS FHESTEGKVGYELKDE IERKFDKW 
QEPP P VKQVKPLPAPLDGQR KKRGGRRYRKMKERLGLTE IR \ KQ 
AN RMS FGE I EEDA YQE DLG FS LGHLG KS G S GR VRQTQVNEATKA 
RISKTLQRTLQKQSWYGGKSTIRDR5SQTA3SVAFTPLQGLEI 
VNPQAAEKKVAEANQKYFSSMAEFLKVKGEKSGLMST 




1865 


309 


KQWCAERRGIX3MSIJU3EIiJUAi5I^EAAEEEEGGSYGEE^EEPAlE 
D VQEETQLDLSGDS VKTIAKLWDS KMFAE I MMKI EEYI S KQAKA 
S EVMG PVEAAPE YRVI VDANNLTVE I ENELNI IHKFIRDKYSKR 
PPELESLVPWAU)YIRTVKELGNSLDKCKNNENLQQlLTNATi M 
WS VTASTTQGQQLSE EELERLEEACDMALELNAS KHR I YE YVE 
SRMSFIAPNLS 1 1 IGASTAAKI MGVAGGLTNLS KMPACNTMLLG 
AQRKTLSGPSSTSVLPHTGYIYHSDIVQSLPPIPPPFSVAP\DL 
RRKAARLVAAKCTLAARVDSFHESTEGKVGYELKDE IERKFDKW 
QEPPPVKQVKPLPAPIJX5QRKKRGGRRYRKMKERLGLTEIR\KQ 
ANRMS FGE1 EEDAYQBDLGFSLGHLGKSGSGRVRQTQ VNEATKA 
RISKTLQRTLQKQSWYGGKSTIRDRSSGTASSVAFTPLQGLEI 
VNPQAAEKKVAEANQKYFSSMAEFLICVKGEKSGLMST 


6085 


2 


1456 


SGPRSFQGNRAVGRJSIiGGKRKPEVTLLPGVSSERVRRWRRARV " 
GVAR VKPGNP WKPSPATQVPR/VPAQVYLPGRG PPLREGEEL VM 
DEEAYVLYhT^TGAPCLSFDIVRDHLGDNRTELPLTLYLCAGT 
OAESAQSNRLMMLRMHNLHGTKPPPSEGSDEEEEEEDEBDEEER 
KPQ LE LAMVPHYGG INRVRVS WLG E E P VAGVW S E KG QVEVFAIjR 
RI^QVVSEPQALAAFLRDEQAOMKPIFSFAGHMGEGFALDWSPR 
VTGRLLTGDCQKNIHLWTPTDGGSWHVDQRPFVGHTRSVEDLQW 
SPTENTVFASCSADASIRIWOIRAAPSKACMLTTATAHDGDWV 
ISWSRREPFLLSGGDDGALKI WDLRQFKSGS PVATFKQHVAPVT 
SVEWHPQDSGVFAASGADHQITQWDLG/ IVERDPEAGDVBAD^G 
I»AI)LP(JQLI*FVHQGETELKELHWHPQ CPGLLVS TALSGFTI FRT 
ISV 


6086
6087


2419 


1357 


GAATQHGGAMNLLPCNPHGW GLbYAGFKQDHGCFACG^lEyGFR V 

YNTDPLKEKEKQEFLEGGVGHVEmFRCNYLALVGGGKKPKYPP 

NKVM I WDDLKKKTVI EIEFSTEVKAVKLRR\DKIWVLDSMI KV 

FTFTHN P \HQLHVPE \TCYNPKGLCVLCPNSNNSLLAFPGTHTG 

HVQLVDLASTEKPPVDI PAKE GVLS CI ALNLQGTR IATASE KGT 

LIRIFDTSSGHLIQELRRGSQAANIYC1NFNQDASLICVSSDHG 

TVHIFAAEDPKRNKQSSLASASFLPKYFSSKWSFSKFQVPSGSP 

CICAFGTEPilAVIAICADGSYYKFLFNPKGECIRDVYAQFLEMT 
DDKL 




476 


1877 


QNSQKTGLPITIFSR^PI^TGSPU^ 

LVAVI YLVS I VVAVPLCVMELQKLEVGIHTKAWFIAGI FLLLT I 
PIS LWV I LQHLVHYTQPELQKP I IRILWMVPI YS LDS W I ALKY P 
GI AI YVDTCRECYB AYVI YNFMG FLTN YLTNRYPNLVL I LEAKD 
QQKHFPPI^CCPPWAMGEVLLFRCmSVLQYTVVRPFTTIVALI 
CELLGIYDEGNPSFSKAWTYLVI INNMSQLPAMYCLLLFYKVLK 
EELSPIQPVGKFLCVKLWFVSFWQAWIALLVKVGVISEKHTW 

EWQTVEAVATGI^DFIICIBMFLAAIANHHYTFSYKPYVQEAEE 
3S C FDS FLAM WDVSD TRDD T .c? pn VP WW2 d t\7~d nnnnvvr Dn „-.- 

DQNEHTS LLS S SSQDA I S IASSMPPSPMGHYQGFGHTVTPQTT P 
ITAKI SDEI LS DTIGEKKEPSDKS VDS 


6088


1664 


689


3ASGLVRLLQQQHRCLLAPVAPKLVPPVRGVKKGFRAAFRFQKE 
[iERQRLLRCPPPPVPJ^EKPIWDYHAEIC^FGHRLQENFSLDLL 
^TAFVNS CYIKS EEAKRQQLG IEKEAVLLNL JCSNQELSEQGTS F 
5 QTCLTQ FLEDEYPDM PT EG I KNL VDFLTG EEWCHVARNLA VE 
3LTLSEEFPVPPAVLQQTPFAVIGALLQSSGPERTALFIRDFLI 

cqmigkelfemwkiinpmgllveelkkrnvsapesrltrqsgVa 








PTALPLYPVGLYCDKKLIAEGPQETVLVAEEEAARVALRKLYGF 
TENRRPWNYS KPKETLRAEKS ITAS 


6089 


3 


3054


TRIX3IPGSriSSRPRLC^AAEGHFLGHSWrGSRAGAHTGAi>AW 
PSRRLRDLPAGGMWRLRRAAVACEVCQS LVKHSSGX KG S LPLQ K 
LHLVSRS I YHSHHPTLXLQRPQLRTSFQQFSSLTNLPLRKLKFS 
P I KYG YQ PRRNFWPARLATRLLKLR YLILGS AVGGGYTAKKT FD 
QWKDMI PDLS EYKW I VPDIVWE I DE YIDFEKI R KALPS S E Dtj VK 
LAPDFDKIVESLSLLKDFFTSGSPEETAFRATDRGSESDKHFRK 
VSDKEKIDQLQEELLHTQLKYQRILERLEKENKELRKIjVLQKDD 
KG I ? FIESLRKS LIDMYSEVLDVLSDYDAS YNTQDRL P RWWG 
DQSAGKTSVIiEMIAQARIFPRGSGEMMTRSPVKVTLSEGPHHVA 
LFKDS SREFDLTKEED1AALRHE I EL RMRKNVKEGCTVS P ETI S 
LNVKGPGLQRMVLVDLPGVINTVTSGMAPDTKETIFS I SKAYMQ 
DPNAXILCIQDGSVDAERSIVTDLVSQMDPHGRRTIFVLTKVTL 
AEKNVASPSRIQQIIEGKLFPMKALGYFAWTGKGNSSES IEAI 
RE YEEEFFQNS KLLKTSMLKAKQVTTRNLSLAVSDCFWKMVRES 
VEOQADSFKATRFNIaETEWRNKYPRLRELDRNELFEKAKNEILD 
B V I SLSQVTP KHWEE I LQQSLWBRVS THVI BKf I YLPAAQTMNSG 
TFNTTVDIKLKQWTDKQLPNKAVEVAWETLQEEFSRFMTEPKGK 
EHDD I FDKLKEAVKEES I KRHKWNDFAEDS LRVI QHNALEDRS I 
SDKQQWDAAIYFMEEALQARIiKDTBNAIENMVGPD\WKKRWLiYW 
KNRTQBQCVHNETKNELEKMIjKCNEEHPAYLASDE ITT VRKNLE 
SRGVEVDPS L I XDTWHQVYRRHFLKTALNHCNL CRRG FY Y YQRH 
FVDSE^COT)VVLFWRIQRMIAITANTLRQQLTNTBVRRLEKNV 

KEVLBI)FAEDGEKiaCIttiLTGKR\^IiAEDIiKKVREIQEKLDAFIE 
ALHQBJC 


6090 


194 


1560 


PVFVPAPGAVIiEQAS/ASPPLATQTWPLQHC^lPELPVQASTL - 
FELQIiFFCQLIAIiFVEIYINIYKTVWWYPPSHPPSHTSLNFHLID 
FNLLMVTT 1 VLGRRFIGS IVKEASQRGKVSLFRS ILLFLTRFTV 
LTATG WS LCRS L IHL FRT YS FT*NLL / FPLLS VWD VHS VPAAE ~LR 
P\RKTSLFNHMASMGPREAVSGLAKSRDYLLTLR\RRGSSTQDS 
CMARTPCP/PHACC^PSLIRSEVBFIJCMDFNWRMKEVLVSSML 
SAYYVAFyPVWFVKNTHYYDKRWSCELFLLVSISTSVIIiMQHI^L 
PASYCDLLHKAAAHLGCWQKVDPALCSNVLQHPWTEECWWPQGV 
LVKKSKNVYXAVGHYNVAI PSDVSHFRFHFFFSKPLRILNILIaL 
LEGAVI VYQLYS LMSSEKWHQTISLALI LFSNYYAFFKLLRDRL 
VLGKAYS YSAS PQRDLDHRFS 


6091 


3279 


412 


SSRTREMEEKEILRRQIRLLQQLiIDDYKTLHaNAPAPGTPAASG " 
WQPPTYHSGRAFSARYPRPSRRGYSSHHGPSWRKKYSLVNRPPG 
PSDFPADHAVRPLHGARGGQP PVPQQHVLBRQVOLSQGQNWI K 
VKPPS KSGSASASGAQRGSIiEE FEDTP WSDQRPREGEGE PPRG Q 
LQPSRPTRARGTCSVEDPLLVCQKEPGKPRMVKSVGSVGDSPRE 
PRRTVSESVIAVKASFPSSALPPRTGVALGRKLGSHSVASCAPQ 
LLGDRRVDAGHTDQP VPSGS VGG PARPAS GPRQAREAS LWTGR 
TNKFR KNNYKWVAAS S KS PR VARRALS PR VAAENVCKAS AGMAN 
KVBKPQLIADPEPKPRKPATSSJCPGSAPSKYKWKASSP3ASSSS 
SFRWQS EAGSKDHASQLSPVIiSRS PSGD\RPAVGHSGLKPLSGE 
TPLSAYKVKSRTKIIRRRGSTSLPGDKKSGTSPAATAKSHLSLR 
RRQAIiRGKSSPVLKKTPNKGLVQVTTHRLCRLPPSRAHLPTKEA 
SSLHAVRTAPTSK VI JCTRYR I VKK1TASPLSAPPFPLSLPSWRA 
RRLSLSRSLVUORLRPVASGGGKAQPGSPWWRSKGYRCIGGVLY 
KVSANKLS KTSGQPSDAGSRPLLRTGRLD PAGSCSRS LASRAVQ 
RSIiAI IRQARQRREKRKE YCMYYNRFGRCNRGERCPY IHDPEKV 
AVCTRFVRGTCKKTDGTCPFSHHVSKEKMPVCSYFIiKGI CSNSN 
CPYSHVYVSRKAEVCSDFLKGYCPLGAKCKKKHTLLCPDFARRG 
ACPRGAQCQLLHRTQKRHS RRAATS PAPGPSDATAR SRVS ASHG 
PRKPSASQRPTRQTPSSAALTAAAVAAPPHCPGGSASPSSSKAS 
SSSSSSSSPPASLDHEAPSUJEAALAAACSNRLCKLPSFISLQS 
S PS PGAQPRVRAPRAPLTKDSGKPLHIKPRL 


6092 


143 


3190


i^KAPPTGESSEPEAKVLHTKRLYRAVVEAVHRLDLILCNKTAYQ 
5VFXPEN I5LRNKLRELCVKLMFItHPVPYGRJCAEELLWRKVYYE 








VlQLIKTrnCKHIHSRSTLECAYRTHLVAGIGFyQHLLLYIQSHY 
QLELQCCIDWTHVTDPLIGCKKPVSASGKEMDWAQMACHRCLVY 
LGDLSRYQNEIiAGVDTELLAERFYYQALSVAPQIGMPFNQLGTL 
AGSKYYNVEAMYCYLRCIQSEVSFEGAYGMLKRLYDKAAKMYHQ 
LKKCETRKLSPGKKRCKDIKRLLVNFMYLQSLLQPKSSSVDSEL 
TSLCQSVLEDFNLCLFYLPSSPNI>SliASEDEEEYESGYAFLPDL 
LI FQMVl ICLMCVHSliBRAGSKQYSAAIAFTIiALFSHLVNHVNl 
RLQAELEEGENPVPAFQSDGTDEPESKBPVEKEEEPDPEPPPVT 
PQVGEGRKSRKFSRLSCLRRRRHPPKVGDDSZ?I*SEGFESDSSHD 
SARASEGSDSGSDKSLEGGGTAFDAETDSEMNSQESRSDLEDME 
BEEGTRSPTLEPPRGRSEAPDSLNGPLGPSEASIASNLQAMSTQ 
MFQTKRCFRLAPTFSNLIjIjQPTTNPHTSASHRPCVNGDVDKPSE 
PAS KEGS ESEGS ES SGR5 CRNERS I QEKLQVLMAEGIjLPAVKVF 
LDWLRTNPDLI IVCAQSSQSLWNRLSVLLNLLPAAGELQESGIA 
LCPEVQDUiEGCELPDLPSSLLLPEDMALRNLPPLRAAHRRFNF 
DTDRPLLSTLEESWRICCIRSFGHFIARLQGSILQFNPEVGIF 
VS IAQS EQESLIiQQAQAQFRMAQBEARRNRLMRDMAQLRLQIiEV 
SQLEGSLQQPKAQSAMSPYLVPDTQALCHHLPVIRQIiATSGRFI 
VIIPRTVIDGLDLLKKBHPGARDGIRYLEAEFKKGNRYIRCQKE 
VGKSFERHKLKRQD ADAWTL YK I LDS CKQLT\ IiAQGAGEED PSG 
MVTI ITGLPIiDNPSLLSGPMQAALQAAAHASVDIKNVLDFYKQW 
KEIG 


6093 


76 


1002 


ACGRRAMLAIiRVART/ SRWGA3U \RGAVWAPGTR PS KRRA CWALL 
P P VPCCLGCLAER WRLRPAALGLRL PG IGQRNHCSGAG KAAPR\ 
PAAGAGAAAEAPGGQWG PASTPSLYBNPWTI PNMLSMTR IGIAP 
VLGYLI IEEDFNlALGVFALAGLTDLliDGFIARNWANQRSALGS 
ALDPIADKI LIS IL YVSLTYADLI PVPLTYM 1 1 SRDVML IAAVF 
YVRYRTLPTPRTLAKYFNPCYATARLKPTFISKVNTAVQLILVA 
ASIAAPVFTnfADSIYLQILWCFTAFTTAASAYSYYHYGRKTVQV 
IKD 


6094 


23 


1010 


P FLRCLRGDQKAKMS ERKVLNKY YPPDFDPS KI PKLKLPKDRQ Y 
WRLMAP FNMRCKT CG E YI Y KG KKFNARKETVQNE VYLGL P I FR 
FYIKCTRCIiAEITFKTDPENTDYTMBHGATRNFQAEKLLEEEEK 
RVQKEREDEELNNPKKVLENRTKDSKLEMEVLENLQELKDLNQR 
QAHVBFEAMLRQHRLSEBERRRQQQBEDEQETAALLEEARKRRL 
LEDSDSEDEAAPSPLQPALRPNPTAIIiDEAPKPKRKVBVWEQSV 
GSLGSRPPLSRLVWKKAKADPDCSNGQPQA/APHPRSPAEQEG 
GQPYTPDAWRVLPEPTGCIPGQ 


6095 


1 


1599 


TRGRAAERSRGRGHGFLGGGFA\SWDYFPSEDFYRCGYCKNES 
GSRSNGMWAHSMTVQDYQDLIDRGWRRSGKYVYKPVMNQTCCPQ 
YTIRCRPLQFQPSKSHKKVLKKMLKFLAKGEVPKGSCE\DEPMD 
STMDDAVAGDFALINKLDIQCDIiKTLSDDIKESLESEGKNSKKE 
EPQELLQSQDFVGEKLGSGEPSHS 





6096 






VKVHTVP KPGKGADLS KP PCRKAKE 1 RKBRKRli KI>1§qKI P AGEL 
EGFQAQGHPPSLFPPKAKSNQPKSLEDLIFESLPENASHKLEVR 
WRSSPPSSQPKATLLESYQVYKRYQMVIHKNPPDTPTESQFTR 
FLCSSPLEAETPPNGPDCGYGSFHQQYWLDGKIIAVGVIDILPN 
CVSSVYLYYDPDYSFLSLGVYSALREIAPTRQLHEKTSQLSYYY 
MG FY IHS CP KMKYKGQ YRPSDIiLCPSTYVWVP I EQCLPS L ENS K 
j YCRFNQDPEAVDEDRSTE PDRLQVFHKRAI MPYGVYKKQQKDPS 
EEAAVLQ YAS LVGQ KCS ERMLLFRU 


6097 


2277 


575 


QRVRAAIiSSAMEDSEAIXSFEHMGLDPRLi^VTDLGWSRPTLI 
QBKAI PLALEGKDLLARARTGSG KTAAYAI PMLQLLLHRKATGP 
VVEQAVRGLVLVPTKELARQAQSMIQQLATYCARDVRVAKTVSAA 
EDSVSQRAVLMEKPDVWGTPSRILSHiXMDSLKLRDSLELLVV 
DEADLLPSPGFEEEIjKSLLCHLPRIYQAFLMSATFNEDVQALKE 
LILHNPVTliKLQESQLPGPDQLQQFQWCETEEDKFLLLVAIjLK 
LSLIRGKSLL FVNTLERS YRLRLFLEQFS I P TCVLNGBLP LRSR 
CHI ISQFNQGFYDCVIATIlAEVIiGAPVIKSKRRGRGPKGDKASDP 
EAG VAR G I DFHHVS A VLNFDLP P TP EAYIHRAGRTARANNPG I V 
LTPVLPTEQFHLGKIEELLSGENRGPILMYQFRMEEIEGFRYR 

crdamrsvtkqairearlkeikebllhsbklktyfednprVdlq 

LLRHDL P LHPA WKPHLGHVP DYL VP PALRGL VRPHK K\ G RS CL 
PLVGRPREQSPRTHCAASSTKERKSDPQPSPPEWGPIiWS 




1673 


192 



APGTMSGGKKKSSFQITSVTTDYEGPGSPGASDPPTPQPPTdpP 

PRLPNGEPSPDPGGKGTPRNGSPPPGAPSSRFRVVKLPHGLGEP 

YRRGRWTCVDVYERDLEPHSFGGLLEGIRGASGGAGGRSLDSRL 

ELASLGLGAPTPPSGLSQGPTSWLRPPPTSPGPQARSFTGGLGQ 

LWPSKAJCAEKPPLSASSPQQRPPEPETGESAGTSRAATPLPSL 

RVEAEAGGSGARTPPLSRRKAVI^IRIiRMBLGAPEEKGQVPPLDS 

RPSSPALYPTHDASLVHKSPDPFGAVAAQKFSUVHSMLAISGHL 

DSDDDSGSGSLVGIDNKIEQAMDLVKSHLMFAVREEVEVLKEQI 

RELAERNAAliEQBNGLLRALA\SPEQIX3SAGPPRGVPR\LGPPA 

PNGPFVLSLPSLTIVPLGLPGLASAAWPPLPMPALIVPVFPGVG 

VQALSNGPWSPGPLPHLLIIPSLDGGGEGFRTGRQQGAPFGEET 
QPPPSLPGTPQQ 


6098 


168 


1074 


JJ Y. CLRHRSPLEKDSS PGSS STS LLI KKQRETSDTP I MRALKE LD 
EG KI FKWWGTQTEKEDTSN INPRQTETS VNASRS PEKCAQQRQK 
RliNSASQRSSSLPPSNRKSSTPTIO^IMLTPVTVAYSPKRSPKE 
NliSPGFSKii^KNESSPIRFDILLDDLDTVPVSTLQRTNPRKQL 
\ QFIiPLDDSE EK\ TYSEKAT\DNI VNHSS CPEP VPNG VKKVSVR 
TAWEKNKSVSYEQCKPVSVTPQGNDFBYTAKIRTLAETERFF\D 
ELTKEKDQIEAALSRMPSPGGRITLQTRLNQEAFGRS Fftyr> 


6099 
6100 


168 


1074 


WYCU^SPLEKDSSPGSSSTSLLIKKQRETSaiPIMRAinCELD" 
EGKIFKNWGTQTBKEDTSNINPRQrETSVNASRSPEKCAQQRQK 
RLNSASQRS SSLPPSNRKSSTPTKREIMLTPVTVAYS PKRS PKE 
NLSPGFSHLIiSKNESSPIRFDILLDDLDTVPVSTLQRTNPRKQL 

\QFI>PLDDSEEK\TYSEICAT\DNIVNHSSCPEPVPNGVKKVSVR 
TANEKNKSVSYEOCKPVSVTPDGXTDPRVPairTOTT stj-ppbdm ^ 

ELTKEKDQIEAALSRMPSPGGRITLQTRLNQEAFQRSFGKD 


6101 


2 


713


FVEVSGYRSRADPEPRGRDT^fTYAYLFKYIIIGDTGVGKSCLLL 
QFTDKRFQPVHDLTIGVEFGARMVNIDGKQIKLQIWDTAGQESF 
RSITRSYYRGAAGALLVYDITRKETFNHLTSWLEHARQHSSSNM 
VIMLIGNK5DLESRRDVKREEGEAFARE\HGLIFMETSAKTACN 
VEEAFINTAKEI YRK1QQGLFDVHNEANGI KIG PQQS I STSVGP 
SASQRNSRD IGSNSGCC 




1 


1399 


PTlGRAWPLREVSHWliGCRRVCS WSAS WGRJL PALS AR LS PLLAFR 
3^1VFPLSCAVQQYAWGKMGSNSEVARIiLASSDPIAQIAEDKPY 
^ELWMGTHPRGDAKI LDNR I SQKT1>S QWIABNQDS LGS KVKDTF 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








NGNLPFLFJWI^VBTPliSIQAHPNKElAEKLHLQAPQkYPDANH 
KPEMAIALTP FQGLCGFRPVEE I VTFLKKVPEFQFLIGDEAATH 
liXQTMSHDSQAVASSLQSCFSHLMKSEKKVVVBQLNLLVKRISQ 
QAAAGNNMED X FGELLLQLHQQYPGDIGCFAI YFLNLLTLKPGE 
AM FLEANVPHAYLKGDCVECMACS DNTVRAG LTPKF I DVP TLCE 
MLS YTPS SS KDRL FLP TRSQBD P Y LS I YDPP VPDFT IMKA\ EVP 
G \ S VXE YKDliALDSAS I LLMVQGT VIASTPTTQTP I PLQRGGVL 
FIGANESVSLKLTEPKDLLIFRACCLL 


6102 


70 



2415 


QTPQATIiAANGAEDSRGGEMLPAGSIGASPAAPCCSESGDERKN 
LBEKSDINVTVLIGSKQVSEGTDNGDLPSYVSAFIEKEVGNDLK 
SLKKLDKLIEQRTVSKMQLEEQVLTISSEIPKRIRSALKNAEES 
KQPLNQFLEQBTHLFS AINSHLLTAQPWMDDLGTM I SQ I EE IER 
HLAYLKW I S Q IEELSDNI QQYLMTNNVPEAASTLVSMABLD I KL 
QESSCTHLLGFMRAT\nCFWHKILKDKLTSDFEEILAQLHWPFIA 
P PQSQTVGLS RPASAPE I YS YLETLFCQLLKLQTSHELLTEPK\ 
HSQKNTLFLPPLLSS/WPIQVMLTPLQKRFRYHFRGNRQTNVLS 
KPEW YLAQVLMWIGNHTE FLDE KI Q P ILDKVGSLVNARLE FfiRG 
LMMLVLEKLATDI PCXLYDDNLFCHLVDE VLLFERELHS VHG YP 
GTFAS CMH1 LSEETCFQRWLTVERKFALQKMDSMLSSEAAWVSQ 
YKDI TDVDEMKVPDCAETFMTLLIjVITDRYKNLPTASRKIjQFLE 
LQKDLVDDFRIRLTQ VMKEETRASLG FR YCAI LNAVNYT STVLA 
DWADNVFFLQLQQAALEVFAENNTLS klqlgqlasmess vfddm 
inllerlkhdmltrqvdhvfrevkdaaklykkerwlslpsqseq 
avmslsssacpllltlrdhllqleqqlcfslekifwqmlvekld 
vyiyqeiilanhfneggaaqlqfdmtrnlfplfshyckrpenyf 
khikeacivlnlnvgsaltagkdvlpvqlqgsfpat 


6103


207 


2523 


ESNSTMTTYIiEFIQQNEERDGVRFSWNVWPSSRLEATRMWPVA 
ALFTPLKERPDLPPIQYEPVLCSRTTCRAVLNPLCQVDYRAKLW 
ACNFCYQRNQFPPSYAGISELNQPAELLPQFSSIEYWLRGPQM 
P LI FLYVVDTCMEDEDLQALKESM QM 3 LShh P PTALVGL I TFGR 
MVQVHELGCEGISKSYVFRGTKDLSAKQLQBMLGLSKVPVTQAT 
RGPQVQQPPPSNRFLQPVQKIDMNLTDIiLGELQRDPWPVPQGKR 
PLRS SGVALS I AVGLLE CT F PNTGAR I MMFIGG PATQG PGM WQ 
DELKTPIRSWHDIDKDNAKYVKKGTKHFEAIiANRAATTGHVIDI 
YACALDQTGLL EMKCC PNLTGGYMVMGDS FNTS LFKQTFQRVFT 
KDMHGQFKMGFGGTIjEIKTPR\EIKISGAIGPCVSLNSKGPCVS 
ENEIGTGGTCQWKIOGLSPTTTLAIYFEVVNQHNAPIPQGG\RG 
v l\jx 11ARN\WADAQTQIQNXAASFD 
QKAAAILMARLAIYRAETEEGPDVLRWLDRQLIRLCQKFGEYHK 
DDPSSFRFSETFSLYPQFMFHLRRSSFLQVFNNSPDESSYYRHH 
FMRQDLTQSLIMIQPILYAYSFSGPPEPVLLDSSSILADRIIjLM 
DTFFQ ILI YHGETIAQWRKSGYQDMPEYENFRHLLQAPVDDAQE 
ILHSRFPMPRYIDTEHGGSQARFLI^KVNPSQTHNNMYAWGQES 
GAPILTDDVSLQVFMDHLKKLAVSSAA 


6104 


124 


732 


KVSEYIILSKDKILFHAIiAMLVLW^PWSAARGVLRNYWERLLR 
JCLPQSRPGFPSPPWGPALAVQ\AQPCLQSQQMIPVEVKRI /RSL 
LDSIFWMAAPKNRRTIEVNRCRRRNPQKlilKVKNNIDVCPECGH 
LKQKHVLCAYCYE KVCKETAE XRRQ IGKQEGGP FKAPTIET WL 
YTGETPSEQDQGKRIIERDRKRPSWFTQN 


6105


3 


989 


PLHGACTSI*VLQRFCHRRPRPCAPARPEDMRRPAAVPL1jLL1»CF 
GSQRAKAATACGR PRMLNR M VGGQDTQEGEWPWQVS IQRNGSHF 
CGGSLIAEQWVLTAAHCFRNTSETSLYQVLLGARQLVQPGPHAM 
YARVRQVESNPLYQGTASSADVALVELEAPVPFTNYIIiPVCLPD 
PSVTFETGMNCWVTGWGSPSEEDLLPEPRILQKLAVPIIDT\PR 
CNLLYSKDTEFGYQPKT1KNDMLCAGFEEGKKDACKGD9AGPIjV 
CLVGQSWLQAGVISWGEGCARQNRPGVYIRVTAHHNWIHRIIPK 
1 LQVQPSEVGRPEVTPPGPGAP 


6106 


3 


1302 


GRPPTAPHTGRPPTANRGDPRLDLKRGC^LLTSIESRGRPAAS 
| AGLRRDR CALRRWPLRRAP LARATRRRAG SPRRCAPRPRACPQG 
! WSRARHQPGGLOLiIiLLLI^QFMEDRSAQAGNCWU?C2AKNGRC0V 
! LYKTELSKEECCSTGRLST5WTEEDVNDN7LFKWMI FNGGAPNC 
I P CKETCBNVO CGPGKKCRMNKKNKPRCVCAPDCSN I TWKG P VC 
GLDGKTYRNECALLKARCKEQPELEVQYQGRCKiCTCRDVFCPGS 
STCV\VDQTNNAYCVTCNRICPEPASSEQYLCGNDGVTY5\SAC 
HLRKATCLLGRS IGLAYEGKCI KAKSCEDIQCTGGKKCLWDFKV 
GRGRCSLCDELCPDSKSDEPVCASDNATYASBCAMKEAACSSGV 
LLEVKHSGSCNSISEDTEEEEEDEDQDYSFPISSILEW 


6107


623 


168 


SRCSS PRP SPGRGRGK/ LS PSEHRKWVEVFKACDEDHKG YIjSRE 
DFKTAVVMIjFGYXPSKIEVDSVMSSINPNTSGIIiI^GFLNIVRK 
KKEAQRYRNEVPJIIFTAFDTYYRGFLTLEDFKKAFRQVAPKLPE 
RTVLEVFREV\DRDS\DGHVSF 


6108 


3 


1348 


GGSLRFSPPRVPS.CSRVFCPVPPGGCGLPSPMSASfePQSPTTPTT" 

CLPRRYMKMK3U3DGPEKQEDE^VDVTPVMTCVFVVMCCSMLVLL 

YYFYT)LLVYWIGIFCLASATGLYSCIAPCVRRLP\SASAGESA 

LLAPT I PNNSL P YFHKRPQARMIJLLALFCVAVS VVWGVFRNEDQ 

WAWVLQDALGIAFCLYMLKTIRLPTFKACTLLLLVLFLYDIFFV 

FITP FliTKSGS S I MVEVATG PS DS ATREKLPMVLKVPRLNS S PL 

ALCDRPFSLLGFGDILVPGLLVAYCHRFDIQVQSSRVYFVACTI 

AYGVGLLVTFVALALMQRGQ PAIiLYLVPCTLVTS CAVALWRREL 

GVFWTGSGFAKVLPPSPWAPAPADGPQPPKDSATPLSPQPPSEE 

PATS PW PAEQS PKS RTS EEMGAGAPMREPGS PAES EGRDQAQ PS 

PVTQPGASA 


6109 


1 


1381 


CRSRAGAASGGAILEGTIO^RQRVDTITKPLDPLVPSALRAAMLY 
LEDYLEMI EQLPMDLRDRFTEMREMDLQVQNAMDQLEQRVSEFF 
MNAKKNKPEWRBEQMASI KKDYYKALEDADEKVQLANQI YDLVD 
RHLR KL DQELAKF KMELEADNAG I TE ILERRSLELDTPSQPVNN 
HHAHSHTPVEKRKYNPTSHHTTTDHIPEKKFKSEALUST^TSDA 
S KENTLGCRNNNSTASSNNAYNVNSSQPLGSYNIGSLSSGTGAG 
GI \TMAAAQAVQATAQMKEGRRTS SLKAS YEAFKNNDFQLGKEF 
SMARBTVGYSSSSALMTTLTQNASSSAADSRSGRKS KNNNKSSS 
QQSSSSSSSSSI>SSGSSSSTVVQEISQQTTWPESDSNSQVDWT 
YDPNBPRYCICNQVSYGEMVGCDTQDCPIEWFHYGCVGLTEAPK 
GKWYCPQCT\AAMKRRGSRHK 


6110 


77 


2464 


ACPSAATMSEX3DHSMDEMTAWKIEKGVGGNNGGNGNGGGAFSQ 
ARSSSTGSSSSTGGGGQESQPS PLAI.I AATCSR IBS PNENSNNS 
QGPSQSGGTGELDLTATQLSQGANGWQIISSSSGATPTSKEQSG 
SSTNGSNGSESSKNRTVSGGQYWAAAPNLQNQQVLTGLPGVMP 
NIQYQVI PQFQTVDGQQLQFAATGAQVQQDGSGQIQIIPOANGQ 
I ITNRGSGGNI lAAMPNLLO^AVPLQGIANNVLSGQTQYVTNVP 
VALNGNITLLPVNSVSAATLTPSSQAVTISSSGSQESGSQPVTS 
gttis saslvs sq as s s s fftnansystttttsnmg i mnfttsg 
SSGTNSQGQTPQRVSGIjOGSDALNIQQNOTSGGSLQAGQQKBGE 

q\nqqtqaapkslsrpqlvc<3g\qalq\afqaapi^g<3tf^rtqa 
isqetlqni^i^avpnsgpiiirtptvgpngqvswqtlqlqnlq 
vqnpqaqtitiiapmqgvslgqtsssnttltpiasaasrpagtvt 
vnaaqlssmpglqtiniisalgtsgio^hpiqglpiiaianapgdh 
gaqlglhgaggdgihddtaggebgenspdaqpqagrrtrreact 
cpyckdsegrgsgdpgkkkohichiggcgkvygktshlrahlrw 
htgerpfmctws ycgkrftrs dei/qrh krthtgekkfacpecpk 
r fmrsdh ls khi kthqnkkggpgvals vgtlpldsgags egsgt 
atpsalittnmvameaicpegiarlansginvkeggqfcspint 

SANGF 


6111
6112


1637 
77 


797 
196 


j RVDPRVRQAMAPWGKRLAGVRGVLLDISGVLYDSGAGGGTAIAG 
SVEAVARLKRSRLKVRFCTNESQKSRAELVGQLQRLGFDISEQE 
VTAPAPAACQ ILKERGLR PYT1I1 1 H DGV\AS EFDQIDTS /STPNC 
VVIADAGESFSYQNKNNAFQVLMELEKPVLISLGKGRYYKETSG 
LMLBVGPYMKALEYACGIKAEVGGKPSPEFFKSALQAIGVEAHQ 
AVMIGDDIVGDVGGAQRCGMRALQVRTGKFRPSDEHHPEVKADG 
YVDNIiAEAVDLLLQHADK 

msshksfkskrflakxqkpnrpiijQwiwijKtgnkirhnwx j 


6113


1779 


567 


WEGRSWAACGVNLQGAWGERSGVIUVSEL^PGKRADVSWWSRQL 
ET^HLANTEINSQRIAAVESCFGASGQPLALPGRVLLGEGVL 
TKECRKKAKPR I FFLFNDI LVYGS I VLNKR KYRS QHI I PLEEVT 
uiir c* a jjumvxv itw ilW^arv VoAAbATL RQEW I SHI E ECV 
RRQLRATGRPA\ STEHAAP W I PDKATDI CMRCTQTRFSALTRRH 

v umwomat JjljJr Jvuo lr tt.tr VKVCSLiCYR k J tAAQQR K 

EEAEEQGAGVPRAASHLARP ICGRPVEMTMTPTRTRRAAG7ATG 
PAAWSS TPRGWPGLPSTADPR PAEHLS PSQLHCPGPQEGSSRS C 

PGLRDPIPWKQVQRWGVALSGLPVPFCWTI^PYGFTAGNAFPFR 
KPQNTHRSW 


6114 


818 


24* " 


PTSRPRPSPGSPAMSWSACVSAAPSSSWPASSSWPCGPRRCCTR 
RRRCSPRCGLAAGSMCSCSPSWRCTPVPACWPSPPP\PABQVQC 
wmjr-riuuiRwujiuji'vfl/u'AKu roPCinPAGPAGPRPARTPPAS 0 
HQPGRPTVPAPPCPLLAATEPTPSRPHQRWTRBDRMLGRGSQVT 
GRPQWFLRGLVLFSL 


6115 


324 


71 


pyCGRVCAHPHL if TH 1 HMH I CAHAC \ I HTHAQ-Ldj / 1 TASHAl!iAH " 
SHLYTCMVMLTASHTPSHTHPHTAVHKEHRADVLRGTLTPLR 


6116 


595 


1430 


tg\^ppgrwhaa/isssgpvfegajuv\lottokeeboes"ytpvq 

AARPO^LNRPGQELFRQLFRQLRYHESSGPLETLSRLRELCRWtfl 
LRPDVLSKAQILELLVLEQFLSILPGELRW7VQLHNPESGBE\L 
WPCWRS CRGTLMGHPGGTRAL P \ EPRCALDGYRS \ LRSAQI WS L 
AS PLRSSS ALGDHLE P P YE I EARDFLAGQSDTPAAQMPALFPRE 

G CPGDO V 'I'PTP QT. T O T JT\ otmtt? V TM ro\ r*v c* o nrwrsn t ?▼ 
ww * w * / w v ±«o.u.i.auJjU*i Jro i.r xUJvIs VTFSQDBWGwI*DSAQRN 

LYRDVMLENYRNMASLGK 


6117 


1433 


222 


vgvpspappcswbvgpgggwtpgiu<egq^grrtplllLatrtr 

GLLSLFPPAAMHPAAFF LP VWAAVLWGAAPTRGL I RATSDHNA 
SMDFADLPALFGATLSQEG LQGFLVEAHPDKACS P IAP ? PEAPV 

NG svf iallrrfdcnfdlkvlnaqkag ygaavvhnvnsnellnm 

VWNSEBIQQQIWI PSVFIGERSSEYLFJUjFVYEKGARVLIjVPDN 
TFPIK3YYLlPFTGIVGLLVIiAMGAVMTappTnwDVT5T rvovior w 
\EQIiKQI \ PTTOYQKGDQYDVCAICIiDBYBDGDKLRVLPCAHAY 
HSRCVDPWLTQTRFCTCPICKQPVHRGPGDEDQEEETQGQBEGDB 

GEPRDHPASERTPLLGSSPTLPTSFGSLAPAPLVFPGPSTDPPL 
SPPSSPVILV 


6118


1044 


247 


stisc^ctsgatpgaqshrsarghaaggketaai^mergkvkk 

KEKEKETQKEKIGEKGREEKVKRKEVEQKI KQEKQEKQERRKGK 

ekeekrtkqgketnkekeqfkgqeekgenkdstltrtpleplek 
nkqilvlgux3agktsvuisiasnrvqhsvaptox3fkavci>rre 

DSO^FI^IGGSKPFP^YWEMYLSN/ADSLARSFSVGPKQDSQP 
^WKAKKYUIQLIAANPVLPLVVFANKO^LEAAYHITDIHEA^ 


6119 


1217 


462 


DPR FV iisw 1'TKAFAQERTTQPRSS RJEGTLRSTME YLS ALN PS DL ' ~ 
LRS VS N IS S E FGRR VW TS AP P PQRP FR VCDHKRTI R KG LTAATR 
iiELLAKALETLLLNGVLTLVUSEDGTAVDS 
VLQSGQSWSP7RSGVLSYGLGRERPKHSKDIARFTFDVYKQNPR 

dlfgslnvkatfygxysmsodfqglXgpkkvlrellrwtstllq 

3LGHMLLG I SSTLRHAVEGAEQWQQKGRLHS Y 


6120 


785 


179 


LERAGGGGIiSSRAIiVGSGACLSLVARANGKGLPRGRKEFVBAVR 
VR YVAFRYRTPRAVCLRLWS CRRE VI MS GRGKQGGKVRAKAKS R 
SSRAGLQFPVGRVHRLLIIKGNYAERVGAGAPVYLAAVLEYLTAB 
I L ELAGNAARDN KKTR 1 1 PRHLQLAI RNDBELNKLLGKVT I AQG 
G\VLPNIQAVLLPKKTESQKDEGANDP 


6121 


1612 


107 


F VRAOARGSRO P VRRP LLdQAQS RL.RCR *3 GGRM PP T « w P \c j?a tum 
RGNGLRAVTPLRPGKLLFRSDPLArrVCiCGSRGVVCDRCLLGKE 
KLMRCSQCRVAKYCSAKCQKKAWPDHKRECKCLKSCKPRYPPDS 
VRLLGRWFKLMDGAPSESEKLYSFYDLESNINKLTEDKKEGIiR 
QLVMTFQHFMREBIQDASQLPPAFDLFKAFAKVICNSFTICNAE 
MQEVGVGLYPS ISLLNHSCDPNCS I VFNGPHL LLRAVRD IEVGE 
BLT I CYLDMLMTSE ERRKQLRDQ YCFECD \ CFRCQTQD KDADML 
TGDEQVWKEVQESLKKIEELECAHWKWEQVIAMOQAIISSNSERL 
PDIKI YQLKVLDCAMDACINLGIiIjEBAIiFYGTRTMBPYRI ffpg 
SHP VRGVQVMKVGKLQLHQGMFPQ AMKNLRLAFD I MRVTHGREH 
SLIEDLILLLE/AMRRQHQSILRERSQREIRRVSLLNALLRSHT 
LCFVS CVNLS YWKFGSVFV 


6122 


2 


2324 


RFRKMADGGAASQDESSAAAAAAADSRMNNPSETSKPSMESGDG 

ntgtqtngldpqkqpvpvgqaistaqaoaflghlhovqlagtsl 
qaaaq3lnvqsksneesgdsqqps qpsqqpsvqaai pqtqlmla 
ggq i tgltlt p aqqq lllqq aqaqaqiilaaavqqhs as q q hs aa 
gati s asaatpmtq i plsqp i ql aqdlqqlqqlqqqnlnlqqfv 

LVHPTTNLQPA\QFIISQTPQGQQGLLQA\QNLLTQLPRQSQAN 

IjIjOS O PR T \ TI/T CfTDiTDTnTTH fi*PD T fVFT.Drt PrtCfT5tn> t ninn e» 
\ a i-ti ayrhlr l \+± iiuti r* J. >JiLur yoUo XxrlUi JLulfo 

LEEP\SDLEELEQFAKTFKQRRIKLGFT\QGDAGLAMVKLYGND 
FSPTTIFRFEALNLSFKNMCKX.KPIiIiEKWI^nAENLSSDSSLSS 
PS ALN S PG I EGLS RRRKKRTS IEA\ N I RVALEKSFLEN\ QJCPTS 
EEITMIADQLNMEKGVIRVWFCNRRQKEKRIKPPSSGG\TSSSP 
I KAI F PS PTSLVATTPSLVTSS AATTLTVSP VLPI/TSAAVTNLS 
VTGTSDTTSNNTATVISTAPPASSAVTSPSLSPSPSASASTSEA 
SSASETSTTOTl-STPIiSSPLGTSQVMVTASGLQTA/AQLLPFKG 
AAQLPANASIAAMAAAAGLNPSLMAPSQFAAGGALLSLNPGTLS 
GALSPALMSNSTIATIQALASGGSLPITSLDATGNLVFANAGGA 
PNIVTAPLFLNPQNLSLLTSNP VS LVS AAAAS AGNSAP VAS LHA 
TSTSAES IQNSLFTVASASGAASTTTTASKAQ 


6123 


3 


2944 


HIiLHRWFGTDMQMINFTTGEFQIjTEACPYIiGTHSEESRFGILHL 
HLQPLEMKRVGWFTPADYGKVTSLILIRNNLTVIDMIGVEGFG 
ARELLKVGGRiPGAGGSLRFKVPESTLMDCRRQLKDSKQILSlT 
KNFKVENIGPLPITVSSLKINGYNCQGYGFEVJjDCHQFSLDPNT 
SRDIS IVFTPDFTSSWVIRDLSLVTAADLEFRFTLNVTIiPHHLL 
PLCADVVPGPSWEESFWRLTVFFVSlfSIjLGVIIilAFQQAQYILM 
BFMKTRQRQNASSSSQQNNGPMDVISPHSYKSNCKNFLDTYQPS 
DKGRGKNCl* P VNTPQSRIQNAAKRS PATYGHSQKKHKCS VYYSK 
HKTSTAAASSTSTTTEEKQTSPLGSSLPAAKEDICTDAMRENWI 
SLRYASGINVNLQKNLTLPKNLLNKEENTLKNTIVFSHPSSECS 
NKEGIQTCMFPKETDIKTSBOTAEFKERELCPLKTSKKLPENHL 

prnspqyhqpdlpeisrkkngnn^vpvknevdhcenlkkvdtk 
pssekkihktsredmfsekqdipfveqedpyrkkklqekregnl 
qnlnwsksrtcrknkkrgvapvsrppeqsdiiklvcsdfersels 
sdinvrswciqestrevckadaeiasslpaaqreaegyyqkpek 
kcvdkfcsdsssdcgsssgsvrasrgswgswsstsssdgdkkpm 
vdaqhflpagdsvsqndfpseapislnlshnicnpmtgnslpqy 
aepscpslpagptgveedkglyspgdlwptppvcvtsslnctle 
ng vpcvi qes ap vhns fi dws atcegqfs s aycplblnd ynafp 
eenmnyangfpcpadvqtdfidhnsqstwntpp\nmpas\wqna 
qfpsssrpylkstpkaclpmsglfgpi\wap\qsdvyenccpin 








PTYEHSD/ THMKNQA *VWCKB YY PG F \NPPRAYMNI<DI WTTT \ A~~ 
NRNANFPLSRDSSYCGNV 


6124 




236 


SPEAXjRLAGERGMGRVQLFE I S LsitoRVVYSPGBPLAGT VRVRIj 1 

GAPLPFRAIRVTCIGSCGVSNKANDTAWVVEEGYFNSSLSLADK 

GSLPAGEHSFPFQFLLPATAPTSFEGPFGFCEVHQVRAAIHTPRF 

S KDHKCS LVFY I LS PLNLNS I PD IBQ PNVAS ATKX FS YKL VKTG 

SWLTASTDLRGYWGQALQLHADVENQSGKDTSPWASIiLQKV 

SYKAKRWIHDVRTIAEVEGAGVKAWRRAQWHEQILVPALPQSAL 

PGCSI/IHIDYYLQVSLKAPEATVTLPVFIGNIAV/NPCPSEPPA 

RPGAAS WGPTPGG\ PSAPPQEEABAEAAAGGPHFLDPVFLSTKS 

HSQRQPLLATLSSVPGAPEPCPQDGSPASHPLHPPLCISTGATV 

PYFAEGSGGPVPTTSTLILPPEYSSWGYPYEAPPSYEQSCGGVE 
PSLTPES 


6125 


1 


904 


ktcpkltcafxvsvpdsccrVcrgdgelswehsdgdifrqpanrH 

EARHSYHRSHYDPPPSRQAGG3LSRFPGARSHRGALMDSQQASGT 

ivqivinnkhkhgqvcvsngktyshgbswhpnlrafgivecvlc 

TCNVTKQE CKXIHC PNRYPCKYPQKIDG KCCKVCPG / KKAKEE L 

pgqsfdnkgyfcgbetmpvyesvfmedgettrkialeterppqv 

EVHVWTIRKGIIjQHFHIEKISKRMFEELPHFKLVTRTTLSQWKI 

ftegeaqisqmcssrvcrteledlvkvlylersekghc 1 


6126 


1224 


389 


ri^seapcprsrrrfqmnpewgqafvhvavagglcavavfWifH 

DSVSVQVGYEHYAEAPVAGLPAFliAMPKNSLVNMAYTLLGLSWL 

hrggamglgprylkdvfaamallygpvqwlrlwtqwrraavldo 

WLTLPIFAWP VAWCLYLDRGWRP \ WLFLSLECVSLAS YGIiALLH 
PQGFEVALGAHWPAVGQALRT\HRHYG/SATPSATYIALGVLS 
CLGF WLKLCDHQIARWRL FQCLTGHFWS KVCDVLQFHFAPLFI* 
THFNTHPRFHPSGGKTR 


6127 


1335 


463 


VLPRRCLVFVVNTMDSSREPTLGRLDAAGFWQVWQRFDADEKBY 1 

IEEKELDAFFWIMtiMKIKJTDDTVMKANI^KVKCM 

GRIRMKEIiAGMFLSEDENFLLIiFRRENPLDSSVEFMQIWRICYDA 

DSSGFISAAELRNFIrRDLFLHHKKAJSEAKLEEYTGTMMKIFDR 

NKDGRLDLNDIJu^IIJujQENFLLQFICMDACSTEKRKGDFEKIFA 

YYDVSKTGALEGPXEVDGFVKDMMELVQPSISGVDtiDKFREILIi 

RHCDVNKDGKIQKSEIALCLGLKINP j 


6128 


2511


843 


TGRMSRRQLKRWVWSSQQVQARGRNVRAPRLGKIAMGIiEMS^D I 
S PGSLDGRAWEDAQKPQS AWCGGRKTRVYATSS RRAPPS EGTRR 
GGAAR PE KTAEEGPPAAPGS LRHSGPLGPHACPTAL PEPQVTS A 
MSSQWG IBPLYI KAEPAS PDSPKGSSETETEPPVAtiAPG \ PAP 
TRCLPGHKEEEDGEGAGPGEQGGGKLVLSSLPKRLCLVCGDVAS 
GYHYGVASCEACKAFFKRTIQGSIEYSCPASNECEITKRRRKAC 
QACRFTKCLRVGMLKEGVRLDRVRGGRQKYKRRPEVDPLPFPGP 
FPAGPIAVAGGPRKTAAPVNALVSHLLWEPEKLYAMPDPAGPD 
GHLPAVATUH3LFDREIWTIS WAKS IPGFSSLSLSDQMS VLQS 
VWMEVLVLGVAQRSLTLQDELAFAEYLVLDEEGARPAGLGELG\ 
AALLQLVRRLQALRLEREKYVLLKALALANSDSVHIEDEPRLWS 
S C£ KLLH EALLE YEAGRAGPGGGAE RRRAGRLLLTL PLLRQTAG 
KVLAHFYGVKLEGKVPMHKLFLEMLEAMMD 


6129
6130 


1764 
3 


771 
577 


ARFARSAHEGKMPKKKTGARKKAENRRERBKQLRASRSTIDLAlTj 
HPCNASMECDKCQRRQKNRAFCYFCNSVQKL PICAQCGKTKCMM 
KSSDCVIKHAGVYSTGLAMVGAICDFCEAWVCHGRKCLSTHACA 
CPLTDAEC\VECERGVWDHGGRIFSCSFCHNFLCEDDQFBHQAS 
CQVLEAETFKCVSCNRLGQHS CLRCKACFCDDHTRS KVFKQEKG 
KQPPCPKCGHETQETKDLSMSTRSLKFGRQTGGEBGDGASGYDA 
SWKNLSSDKYGDTSYHDEEEDEYEAEDDEEEEDEGRKDSDTESS 
DLFTNLNLGRTYASGYAHYEEQBN | 
3RGGTMRE YKVWLGSG \ G VGKS ALTV\Q FVTCTF I E K YD PT I E | 








DFYRKE I EV\DSS PSVAQISWTQQGTEQF VASMRDL YIKKGQGC 
HiVYSLVNQQSFQ\DIKPMRDQIIRVKVSEKVPVI\LVGN\SVD 
LESEREVSSSEGRALAEEWGCPFMETSAKSKTMVDELFAEIVRQ 
MNYAAQPDKDDPCCSACNIQ 


6131 


3 


1811 


SSPREKTSDSSHRPSRHGFLPLRLVGLSPFSYLCVPPSRPVPG5 
PRS LSAMRLLPLAPGRLRRGS PRHLPSCS PALLLLVLGGCLGVF 
GVAAGTRRPNWIilJjrDDQDEVLGGMTPLKKTKALIGEMGMTFS 
SAYVPSALCCPSRA5ILTGKYPHNHHWNNTLEGNCSSKSWQKI 
QE PNTFPAI LRSMCG YQTFF\AGKYLNBYGAPnAGGLEHVPU3W 
S YWYALBKITS KYYNYTLS INGKAR KHGENYSVD YLTDVLANVSL 
D FLD YKSNFE PFFMMTATP\ APHS PWTAAPQ YQKAFQNVFAPRN 
ra/FMIHGTMKHWIrlRQAKTPMTNSSIQFLDNAFRKRWQTLLSVD 
DLVEKLVKRLEFTGELNNTYIFYTSDNGYHTGQFSLPIDKRQLY 
EFDIKVPLLVRGPGlKPNQTSKMLVANIDLGPTILDIAGYDIiNK 
TQMDGMSLLPILRGASNLTWRS0VLVEYQGEGRNVTDPTCPSLS 
PGVSQCFPDCVCEDAYNNTYACVRTMSALMNLQYCEFDDQEVFV 
BVYNLTADPDQITNIAKTIDPEIiIiG KMNYRLMMLQSCSGPTCRT 
PGVFDPGYRFDPRLMFSNRGSVRTRRFSKHLL 1 


6132 


96 


1241 


AAGLL PPGLVPEDPRRTRNLLP FG I QGPP FALS RPLFS CVESGW 
AWEAMEPEFLYDLLQLPKGVBPPAEEELSKGGKKKYIiPPTSRKD 
PKFEELQKJA\VLMEW INATLLPEHI VVRSLEEDMFDGLI LHHL 
FQRLAALKLEAEDIALTATSQKHKLTVVLEAVNRS\CSWRSGRP 
SG A/ WES I FNKDLLSTLHLLVALAKRFQPDLSL PTNVQVE VITI 
ES TKS GLKSE KLVEQLTE YS TDKDE P PKD VFDELFKLAPE KVNA 
VKEAI VNFVKQKLDR LGLS VQNLDTQFADGVI LLLLIGQLEGFF 
LHLKEFYLTPNS PAEMLHNVTLALELL/ IGRGPAQLPC /LALK/ 
TI VNKDAKSTLR VLYGL FCKHTQKAHRDRTPHGAPN 


6133 


2 


4235


FVHGSMADTDLFMECEEEELEPWQKISDVIEDSWEDYNSVDKT 
TTVS VS QQ PV3AP VP IAAHAS VAGHLSTSTTVSS SGAQNSDSTK 
KTLVTLIANNNAGNPLVQQGGQPLILTONPAPGLGTMVTQPVLR 
PVQVMQNANHVTSSP VASQ PI FITTQGFPVRNVRPVQNAMNQVG 
IVLNVQQGQTVRPITLVPAPGTQFVKPTVGVPQVFSQMTPVRPG 
STMP VRPTTNTFITVI PATLTI RSTVPQSQSQQTKSTPSTS TTP 
TATQPTS LGQLAVQS PGQSNQTTNPKLAPS FPS PPAVS IAS FVT 
VKRPGVTGENSNEVAiOLVNTIiNTI PS LGQS PGPVWSNNS SAH\ 
GSQRTS G PESSMKVTSS I PVFDLQDGGRKICPRCNAQFRVTEAIi 

RGHMCYCCPEMVEYQKKGKSLDSEPSVPSAAKPPSPEKTAPVAS 
/THPSSTPIPAI*SPPY/TKVPEPNENVGDAVQTKLIMLVDDFYY 

GRDGGKVAQLTNF PKVATS FRCPHCTKRLKNNIRFMNHMKHHVE 
LDQQNG EVDGHT I CQHCYRQFST P FQLQCHLENVHS P YESTTKC 
KICEWAFESEPLFLQHMKDTHKPGEMPYVCQVCQYRSSLYSEVD 
VHFRMIHEDTRHLLCPYCLKVFKNGNAFQQHYMRHQKR\NVYH\ 
CNXCRVQFLFAKDKIEHKLQHHKTFRKPKQLEGLKPGTKVTIRA 
SRGQPRTVPVSSNDTPPSALQEAAPLTSSMDPLPVFLYPPVQRS 
IQKRAVRKMS VMGRQTCLECS FEI PDFPUHFPTYVHCSLCR YST 
uuo fvM x /\rmro x w w n V r Kiva PiuluAjuFKNS VSG I KLACTSCTFVT 
SVGDAMAKHLVFNPSHRSSSILPRGLTWIAHSRHGQTRDRVHDR 
WVKNMYP P PS FPTNKAATVKS AGATPAE PEELLTPLAPALPS PA 
STATPPMPTOPQAIJUjPPLATEGAECIJJVDDQDEGSPVTQEPE 
LASGGGGSGGVGKKEQLSVKKLRWIiFALCCNTEQAAEHFRNPQ 
RRIRRWLRRFQASQGENLEGKYLSFEAEBKIiAEWVLTQREQQLP 

VNEBTLFQKATKIGRSLEGGFKISYEWAVRFMLRHHLTPHARRA 

VAHTLPKDVAENAGLFIDFVQRQIHNQDLPLSMIVAIDEISLFL 

DTEVLSSDDRKENALQTVGTGEPWCDWLAILADGTVLPTLVFY 
RGQMDQPANMPDSILLEAKESGYSDDEIMELWSTRVWQKHTACQ 
RSKGMLVMDCHRTHLSEBVLAMLSASSTLPAVVPAGCSSKIQPL 
DVCIKRTVKNPLHKKWKEQAREMADTACDSDVXiLQIjVLVWLGBV 
LGVIGDCPELVQRSFIjVASVLPGPDGNINSPTRNADMQEELIAS 
LEEQLKLSGEHSESSTPRPRSSPEETIBPBSLHQLPEGESETES 
PYGFEEADLDLMEI 


6134 


2 


42Sg 


FVHGSMADTDLFMBCEEEELEPWQKISDVIEDSVVEDYNSVDKT 
TTVSVSQQP VS APVP IAArtASVAGHLSTSTTVS SSGAQNSDSTK 
KTLVTLIANNNAGNPLVQQGGQPL I LTQNPAPGLGTMVTQ PVLR 
PVQVMQNANHVTSS PVASQPI F ITTQGFPVRNVRPVQNAMNQVG 
IVIjNVQQGQTVKPITLVPAPGTQFVKPTVGVPQVFSQMTPVRPG 
STMPVRPTTNTFTTVI PATLTIRSTVPQSQSQQTKSTPSTSTTP 
TATQPTSI^LAVQSPGQSNQTTNPKLAPSFPSPPAVSIASPVT 
VKRPGVTGENSNEVAKLVNTLNTI PSLGQSPGPWVSNNSSAH\ 
GSQRTSGPESSMKVTSS I P VFDLQDGGRKI CPRCNAQFRVTEAI* 
RGHMCYCCPEMVEYQKKGKSLDSBPSVPSAAKPPSPEKTAPVAS 
/THPSSTPIPALSPPY/TKVPEPNENVGDAVQTKLIMLVDDFYY 
GRIX5GKVAQLTNFPKVATSFRCPHCTKRLKNNIRFMNHMKHHVE 
LDQQNGE VBGHTI CQHCY RQ FSTPFQLQCHLBNVHS P YES TTKC 
KICEWAPESEPLFLQHMKDTHKPGEMPYVCQVCQYRSSLYSEVD 
VHFRMIHEDTRHLLCPYCLKVFKNGNAFQQHYMRHQKR\NVYH\ 
CNKCRVQFLFAKDKIEHKLQHHKTFRKPKQLEGLKPGTKVTIRA 
SRGQPRTVPVSSNDTPPSALQEAAPLTSSMDPLPVFLYPPVQRS 
IQKRAVRKMS VMGRQTCLE CS FEIPDFPNHFPTYVHCSLCRYST 
CCS RAYANHM INNHVPRKS PfCYLAIiF KNSVSG I KLACTSCTFVT 
S VGDAMAKHLVFNP SHRS SSI LPRGLTWIAHS RHGQTRDRVRDR 
NVKNMYP PPS FPTNKAATVKSAGATPAE PEELLTPLAPALPS PA 
STATPPPTPTHPQALALPPLATEGAE CLNVDDQDBGSP VTQEPE 
LASGGGGSGG VGKKEQLS VKKLRWLFALCCNTEQAAEH FRNPQ 
RRIRRWLRRFQASQGENLEGKYLSFBAEEKLAEWVLTQREQQLP 
VNEETLFQKATK IGRSLEGG FKIS YEWAVRFML RHHLTPHARRA 
VAHTLPKDVAENAGLFIDFVQRQIHNQDLPLSMIVAIDEISLFL 
IJTEVLS SDDRKENALQTVGTGE P WCDVVliAI LADGTVLPTLVFY 
RGQMDQPANMPDS ILLEAKESGYSDDEIMELWSTRVWQKHTACQ 
RS KGMLVMDCHRTHLS EEVLAMLS AS STLPAWPAGCSSKIQPL 
DVClKRT\nCNFLHKKWKEQAREMADTACDSDVLLQIiVLVWLGEV 
LGVIGDCPELVQRSFLVASVLPGPDGNINSPTRWADMQEELIAS 
LEEQLKLSGEHSESSTPRPRS S PEET I E PESLHQLFEGESETES 
PYGFEEADLDLMEI 


6135 


2 


4256 


FVHGSMADTDLFMECEEEELEPWQKISDVIEDSWEDYNSVDKT 
TTVSVSQQPVSAPVPIAAHASVAGHLSTSTTVS SSGAQNSDSTK 
KTLVTLIANNNAGNPLVQQGGQ PL I LTQN PAPGLGTMVTQP VLR 
PVQVMQNANHVTSSPVASQPIFITTQGFPVRNVRPVQNAMNQVG 
IVLNVQQGQTVRPITLVPAPGTQFVKPTVGVPQVFSQMTPVRPG 
STMP VRPTTNTFTT VI PATLTIRS TVPQSQSQQTKSTPS TSTTP 
TATQPTSLGQLAVQSPGQSNQTTNPKLAPS FPS PPAVS IAS FVT 
VKRPGVTGENS NB VAKLVNTLNTI P SLGQS PG PVWSNNSSAH \ 
GSORTSGPES S MKVTS S I P VPDT.OIV3fTO Y T CVP CWhr.wx> utp -a t 
RGHMCYCCPEMVEYQKKGKS LDS EPS VPS AAKP PS PEKTAPVAS 
/THPSSTPIPALSPPY/TKVPEPMENVGDAVQTKLIMLVDDFYY 
GPJ3GGKVAQLTNFPKVATSFRCPHCTKRLKNNIRFWNHMKHHVE 
LDQQNGEVDGHTICQHCYRQFSTPFQLQCHLENVHSPYESTTKC 
KI CEWAFESE PLFLQHMKDTHKPGEMPYVCQVCQ YRS SL Y3 EVD 
VHFRM IHEDTRHLLCPYCLKVFKNGNAFQQHYMRHQKR\NVYH\ 
CNKCRVQFLFAKDKIEHKLQHHKTFRKPKQIiEGLKPGTKVTIRA 
SRGQPRTVPVSSNDTPPSALQEAAPLTSSMDPLPVFLYPPVQRS 
IQKRAVRKMSVMGRQTCLECSFEIPDFPNHFPrYVHCSLCRYST 
CCSRAYANHMINNHVPRKSPKYIiALFKNSVSGI KLACTSCTFVT 
SVGDAMAKHLVFNPSHRSSSIIiPRGLTHIAHSRHGQTRDRVHDR 
NVKNMYPPPSFPTNKAArVKSAGATPAEPEELLTPLAPALPS PA 
STATPPPTPTHPOALALPPLATEGAECLNVDDQDEGS PVTQBPE 
IiASGGGGSGGVGKKEQLSVKKLRVVLPALCCNTEQAAEHPRNPQ 
RRIRRWLRRFQASQGENLEGKYIiSFKAEBiCLAEWVLTQREQQLP 
VNEETLFQKATKIGRSLEGGFKISYEWAVRFMLRHHLTPHARRA 
VAHTLPKDVAENAGLF IDFVQRQI HNQDLPLSMIVAI DE I SLFL 
DTEVLSS DDRKENALQTVGTGEPWCDWLAI LADGTVL PTL VFY 
RG QMDQPANM PDS I LLEAKESG Y5 DDE I MEL WS TRVWQKHTAOQ 
R3KGMLVMDCHRTHLSEEVLAMLSASSTLPAVVPAGCSSKIQPL 
DVCI KRTVKNFIiHKKMKEQARKMADTACDSDVLLQLVLVWLGBV 
LGVIGDCPELVQRSFLVASVLPGPDGNINSPTRNADMQEELIAS 
LEEQLKLSGEHSESSTPRPRSSPEETIEPBSLHQLFEGESETES 
FYGFEEADLDLMEI 


6136 


1704 


539 


FG VRMALEGMS KR KRKRS VQEGEN PDDG VRGS P PEDYRLGQ VA5 
SLFRGEHHSRGGTGRLAS LFSSIiEPQIQPVYVPVPK\BSAIiASA 
DLEEE IHQKQGQKRKNS Q PGVKVADRKILDDTEDTVVSQR3CKIQ 
INQEEERLKNERTVFVGNLPVTCN KKKLKS FFKEYGQIESVRFR 
SLI PAEGTLSKKLAAI KRKIHPDQKNINAYWFKEESAATQALK 
RNGAQIADGFRIRVDLASETSSRDKRSVFVGNLPYKVEESAIEK 
HFU)CGSIMAVRIVRDKMTGIGKGFGYVLFBNTDSVHLAIjKLNN 
SE LMGRKLRVMRB VNKEKFKQQNSNPRLKNVS KP KQGLNFTS KT 
AEGHPKSLFIGEKAVLLKTKKKGQKXSGRPKKQRXQK 


6137 


xn 


2656 


RALRKRRCGPGRRGALGSGPGPQRRPGRVPEERPAPPRERKHPG 
MWNML I VAMCLA\ LLGLPGKAQELQGHVS \ 1 1 LAG EQLGDLAK3C 
YLWQG \LFQLYLDEAGRGHS PS FHGAALTAPKQGQELMAKALES 
LSCP KDMAPSHCAEHKDQFLQLS QYRQLKTAED YQALNKD 1 EAQ 
LQHAGLREAGG IFYFSVPPFAYEDIARNINSSCRPGPGAWLRW 
LEKP FGHDHFSAQQLATELGTFFQBEEMYRVDHYLG KQAVAQIL 
PFRDQNRKALDGLWNRHHVERVEI IMKETVDAEGRTSFYEE YGV 
I RDVLQNHLTEVLTLVAME L PHNVSS AEAVLRHKLQVFQALRGL 
QRGSAVVGQYQS YSEQVRRELQKPDSFHSLTPTFAGVLVH IDNL 
RWEG V P F I LKSGKALDER VG YARILFKN QACCVQ S E KHWAAAQS 
QCLPRQLVFHI GHGDLGS PAVLVS RNL FRP SLPS S W KEMEG P PG 
LRLFGS PLSDYYAYS PVRERDAHS VLL SH I FHGRKWFFITTENL 
lASWNFWTPLLESLAHKAPRLYPGGAENGRLLDFEFSSGRLFFS 
QQQPBQLVPGPGPGPMPSDFQVLRAKYRESSLVSAWSEELISKL 
ANDIBATAVRAVRRFGQFHLALSGGSSPVALFQQLATAHYGFPW 
AHTHLWLVDBRCVPL3DPESNFQGLQAHLLQHVRIPYYNIH\AI4 
PVHLQQRLCAEEDQG AHI YARE I SALGANS S FDLVLLGMGADGH 
TASLFPQS PTGLDGEQLWLTTSPSQPHRRMSLSLPLINRAKKV 
AVLVMGRMKRE I TTL VSR VGHE PKKWP ISGVLPHSGQLVWYMDY 
DAFLG 


6138 


4587 


934 


EF S KLTDR WQNAVQG VRQRKGD VDGL VRQWQDFTTS VENLFRFL 
TDTSHLLSAVKGQERFSLYQTRSLIHELKNXEIHFQRRRTTCAL 
TLEAGEIOjIiLTTDLKTKES vgrri SQLQDS wkdmepqlaemi kq 
FQSTVETWDQCEKKIKELKSRLOVLKAOSEDPLPELHEDLHNEK 
ELIKELEQSI^WTQNLKEIX)TMKADLTRHVLVEI)VMVLKEQIE 
HLHRQWEDLCLR VA I RKQEI BDRLNTWWFNEKNKBLCAWLVQM 
ENKVLQTADISIBEMIEKLQKDCMEBINLFSENiCLQLKQMGDQL 
IKASNKSRAAEIDDKLNKINDRWQHLFDVIGSRVKKLKETPAFI 
QQloDKNMSNLRTWLARrESELSKPVVYDVCDDQEIQKRLAEOOD 
LQRDIEQHSAGVES VFNICDVLLHDSDACANETECDS IQQTTRS 
LDRRWRNI CAMSMERRMKI EETWRLWQKFLDDYS RFED W LKSAB 
RTAACPN5S EVLYTS AKE ELKR FEAFQRQIHERLTQLEL I NKQY 
RRLARENRTDTASRL KQMVHEGNQRWDNLQRRVTAVLRRLRHFT 
NQREE FEGTRES I L VWLTEMDLQLTNVEHFS SSD ADD KMRQLNG 
FQQEITI,NTNKIDQIjIVFGEQLIQKSEP\LDAVLIEDELEELHR 
YCQEVFGRVSRFHRRLTSCTPGLEDEKEASENETDMED PRE IQT 
DSWRKRGESEEPSSPQSLCHLVAPGHERSGCETPVS VDS\ IPLE 
WDHTGRRGGPSSSH\EEDEEAQYY\SALSGKSISDGHSWHVPDS 
PSCPEHHYKQMEGDRNVPP VP PAS STPYKP P YG KLLL P PGTDGG 
KEGPRVLNGNPQQBDGGLAGITEQQSGAFDRWEMIQAQEL\HNK 
LKIKQNLQQLNSDISAITTWLKKTEABLBMLKMAKPPSDIOEIE 
L R VKRLQ E I LKAFDT YKAL WS VNVSS KE FLQTES PES TELQSR 
LRQXjS LLWEAAQGAVD s wrgglrqs lmq cqdfhqls qnlllwla 

S AKNRRQ KAHVTDP KAD PRALLECRR ELMQLE KE LVERQPQVDM 
LQEISNSLLIKGHGEDCIEAEEKVHVI\EKKLKQLREQVSQDLM 
ALQGTQNPASPLPS FDEVDSGDQPPATSVPAPRAKQFRAVRTTE 
GEEETESRVTGSTRPQRSPLSRVVRAALPIiQLLLLLLLLtiACLL 
PSSEEDYSCTQANNF\ARSFYPMLRYTNGPPPT 


6139 


52 


1131 


LGDW VWSRTCG VLET PTS VLRRARARG PCPTDS KWAL P RL REGE 
TBRRPWEASSWKTL/IJ^WIGGAASVIVGHPLDTVKTRLQAGVG 
YGNTLSCIRWYRRESMFGFFKGMSFPLASIAVYNSWFGVFSN 
TQRFLS QHRCGE PEAS PPRT LS DLLLAS MVAG WS VGLGG PVDL 
I KI RLQMQTPP VSGRQPRFEVQGSGSCG \EPAYQGPVHCITTIV 
RNEGLAGLYRGASAMLLRDVPGYCLYFI PYVFLSEW ITPEACTG 
PS PCAVWLAGGMAGAI S WGTATPMDWKSRLQADGVYLNKYKGV 
LDC I SQS YQKEGLKVFFRGI TVNAVRGFPMSAAMFLGY ELSLQA 
IRGDHAVTSP 


6140 


694 


136 


RPELELWRIJ^RSWRPLGVPRRCHRRNWKBPVRAQPLSVTVWAP 
RCQRP/QPPAPEPSSPNAAVPEAIPTPRAAASAALELPLGPAPV 
SVAPQAEAEARSTPGPAGS RLG PETFRQRFRQFRYQDAAGPREA 
FRQLREL/SPRQWLRPDI\RTKBQ\IVEMLVQEQLIAILPEAAR 
ARRIRRRTDVRITG 


6141 


2 


984 



AQVGPRSRP CKM PLKLRGKKKAKS KETAGLVEGE PTGAGGGS LS 
ASRAPARRL VFHAQLAHG SAXX3R VEG FS S I QE L YAQ IAGA FE IS 
PSEILYCTTjNTPKIDMBRLLGGQLGLEDFIFAHVKGIEKEVNVY 
KSEDSLGLTITDNGVGYAFIKRIKDGGVIDSVKTICVGDHIESI 
NGENIVGWRHTDVAKKLKELKKEELFTMKLIEPKKAFEIELRSK 
AGKS SGEKIGCGRATLRLRS KGPATVEEMPSETKAK\ AI E KIDD 
W^LYMGIRDIDIATTMFEAGKDKVNPDEFAVALDETLGDFAFP 
DEFVFDVWGVIGDAKRRGL 


6142 


116 


602 


EAEGEQVGGAKCCGDAPHVENRBBETARIGPGVMESKEERALNN 
L I VENVNQENDE KD EKEQVANKGE PLALPLNVS E Y CVPRGNRRR 
FRVRQ P I LQYRWDI MHRLGE PQARMR EENMERIGE E VRQLME KL 
RE KQLSHSLRAVSTDP PHHDHHDE PC \ LM P 


6143 


2802 


210 


FRMR I FLHCP WNQQMWKI WNLLETSLES CKAHLS I QKIiLKER \Q 
\QLPVFKHRDS I VETLKRHRVWVAGET\GSGKSTQVPHFLLED 
LLLNEWEASKCNIVCTQPRRISAVSLANRVCDBLGCENGPGGRN 
S LCG YQ IRMESRACESTRLL YC1TG VLL RK LQEDGLLSNVS /HM 
FI VDEV\HER \ S VQSDFLLI ILKEI LQKRSDLHLILM S ATVDS E 
KFST YFTHCPILRI SGRS YP VE VFHLED 1 1 EETG FVLE KDSE YC 
QKFLEEBEEVTINVTSKAGGIKKYQEYI PVQTGAHADLNPFYQK 
YSSRTQHAILYMNPHKINLDLILELLAYLDKSPQFRNIEGAVLI 
FLPGIiAHIQQLYDLLSNDRRFYS ERY KV I ALHS ILS TQDQAAAF 
TLPPPGVRKIVLATNIAETG ITIPDVVFVIDTGRTKENKYHESS 
OMSSIjVETFVSKAS ALQRQGRAGR VRDG FCFRMYTR ERFEGFMD 
YS VPS I LRVPLBELCLHIMKCNLGS P EDFLS KALDP PQLQV I SN 
AMNLLRKIGACE LNEPKLTPLGQHLAALP VNVKIG KML I FGAI F 
GCLDPVATLAAVMTEKSPFTTPIGRKDEADIJUCSAIAMADSDHL 
TIYNAYLGWKKARQEGGYRSEITYCRRNFLNRTSLIiTLEDVKQE 
LIKLVKAAGFSSSTTSTSWEGKRASQTI»SFQBlALLKAVIiVAGIi 
YDNVGKI IYTKSVDVTEKLACIVETAQGKAQVHPSS VNRDLQTH 
GWLLYQEKIRyARVYLRETTLITPFPVLLFGGDIRVQHRERLLS 
IDGWIYPQAPVKIAVIPKQLRVLIDSVLRKKLENPKMSLENDKI 
LQIITELIKTENN 


6144 


1289 


568 


SG PGS MSGQRVDVKVVMLGKE Y VGKTS LVE R YVHDRPLVGP YQN 
VS AS GGARHGGRGS GGP VI CTYGPDLFPL VA \ TIGAAFVAKVMS 
VGDRTVTLG I WDTAGS KR YEAMSR I YYRGAKAAI VC YDLTDSS S 
F ERAKFVWKE LRSLEEG CQ I YLCGTKSDLL EEDRRRRRVD FHDV 
QD YADNI KAQL FETSS KTGQS VDELFXJKVABD YVSVAAFQVMTE 
DXGVDLGQKPNPYFYSCCHH 


6145 


1109 


196 


GGMDLSELERDNTGRCRL3SPVPAVCRKEPCVLGVDEAGRGPVI, 
GPMVYAICYCPLPRIJu^LEALKVADSKTLLBSERERLFAKMEDT 
DFVGWALDVLS PNL IS TS MLGRVKYNIiNSLSHDTATGL IQ YALD 
OGVNVTQVFVDTVGMPErYQARLQQSFPGIEVTVKAKADALYPV 
\VSAASICAKVARDQAVKKWQFVEKLQDLDTDYG\SGYPNDPQD 
/TKAWLKEHVEPVF\GFP\QFVRF\SWRTAQTI \LEKEAEDVIR 
EDSASENQEGLRKITSYFIiNEGSQARPRSSHRYFLERGLESTTS 
I* 


6146 


428 


781 


LKKKGKEKAEAQQVEALPGPSIiDQWHRSAGEEEDGPVLTDEQKS 
R/ YPGHEAHDQGG\ WDARQSI IRKVVDPETGRTRLI KGDG BVLE 
EIVTKERHREINKQATRGDCLAFQMRAGLLP 


6147 


1 


2304 


GTRQLPPPSPGSGPGDSPEGPEGEAPERRRKAHGMLKLYYGLSE 
GEAAGRPAGPDPLDPTDLNGAHFDPEVYLDKLRRECPIiAQLMDS 
BTDMVRQIRALDSDMQTLVYENYNKF ISATDTIRKMKNDFRKME 
DEMDRIATNMAVITDFSARISATLQDRHERITKLAGVHALLRKL 
QFLFELPSRLTKCVEIX^YGOAVRYOGRAOAVLQOYQRLPSFRA 
IQDDCQVITARLAQQLRQRFREGGSGAPEQAECVELLLALGE PA 

SGFVGGLCQVAAAYQELFAAQGPAGAEKLAAFARQLGS RYFALV 
ERRLAQEQGGGDNSLLVPJUjDRFHPJUjRAPGALLAAAGLADAAT 
EIVERVARERI/SHHI^GIJaAAFLGCLTDVRQALAAPRVAGKEGP 
GUuTLU^VASSILSHIKASLAAVHLFTAKEVSFSNKPYFRGEF 
CSQGVREGLTVGFVHSMCQTAQSFCDSPGEKGGATPPALIiLLLS 
RLCLD YETATIS YI LTLTDEQFLVQDQFPVTP VS TLCAEARETA 
RRLIiTHYVKVQGLVISQMLRKSVETRDWLSTLEPRNVRAVMKRV 
VEDTTAIDVQVLPRtiAGVALTQAGGTVPSRGAGAAEDHWQSLPG 
GGDM C I WASHQAS S VARAS VREPQGNXS PRMNTKRAGE CL CPRS 
CSFSAQDYDIFAP I LP VEKQRLRVTQEVRAGLVLVLKI RPQTNS 
CILPLPHSTGS INSDHVPTK 


6148 


3054 


Is* 


VPAVGGTFADGAMGEAEKFHYIYSCDIiDINVQLKIGSLEGKREQ 
KSYKAVLEDPMLKFSGLYQETCSDLYVTCQVFAEGKPIiALPVRT 
SYKAFSTRWNWNEWUCLPVKYPDLPRNAQVALTIWDVYGPGKAV 
PVGGTTVS LFGKYGMFRQGMHDLKVWPN CRSQMDQKPTKTPGRT 
SSTIiSEDQMSRIAKLTKAHRQGHMVKVDWLDRLTFREIEMINES 
VKRS SN FM YLMGG FRCVKCDDKE YGI VY YEKDGDES S P I LTS FE 
LVKVPDPQMSLENLVESKHHNLPRSIJR5GPSDHDLKPYPSPRDQ 
LKNIVS YPPSKPPTYEEQDLVWEFRYYLTNQDKAIiTKI LTS VI W 
DL PQGAKQ ALALLG KWKPMD VE DS LELLS SHYTNPT VRR YAVAR 
LRQADDEDLLMYLLQLVQALKYENFDDIKNGLEPTKKDSQSSVS 
ENVSNSGINSAEIDSSQIIT/SAPFPSVSSPPP\ASKTKEVPDG 
ENLEQDLCTFLISRASKNSTLANYLYWYVIVECEDQDTQQRDPK 
THEMYLNVMRRFSQALLKGDKS\HlVmSLl^QQTFVDRLV^M 
KAVQRESGNRKKKNERLQALLGDNEKMNLSDVELIPLPLEPQVK 
IRGI IPETATLFKSALMPAQLFFKTEDGGKYPVIFKHGDDLRQD 
QLI LQ 1 1 SLMDKLLRJCENLDLKLTPYKVLATSTKHGFMQFI QSV 
PVABVLDTEGSIQNFFRKYAPSENGPNGISAEVMDTYVKSCAGY 
CVITY ILGVGDRHLDNLLLTKTG KLFH I DFG YILGRD PKPLP PP 
MICIiNKEMVEGMGGTQSEQYQEFRKQCYTAFLHLRRYSNLILNLP 
SLMVDANIPDIALBPDKTVKKVQDKPRLDLSDERAVHYMQSLID 
BSVHALFAAVVBQIHKFAQYWRK 


6149 


1 


1413 


RVDPRVRBNGTANP I KNGKTS PAS KDQRTGKKTS VQGQVQKGND 
ESESDPESDPPSPKSSEEEEQDDEEVLQGEQGDPNDDDTEPENL 
GHRPLLMDSEDEEEEEKHSSDSDYBQAKAKYSDMSSVYRDRSGS 
GPTQDLNTILLTSAQLSSDVAVETPKQEPDVFGAVPFPAVRAQQ 
PQQEKNBKNLPQHRFPAAGLEQEEFDVFTKAPFSKKVNVQECHA 
VGPEAHTIPGYPKSVDVFGSTPFQPFLTSTSKSESNEDIiFGLVP 
FDErTGSQQQKVKQRSLQKLSSRQRRTKQDMSKSNGKRHHGTPT 
STKKTTjKPTYRTPERARRHKKVGRRDSQSSNEFLTISDSKENIS 
VALTDGKDRGNVLQPEESLLDPFGAKPFHSPD\tSWHPP\HQGL 
S\DIRADHNT\VLPGR\PRQN5IiHGSFHSADVLKMDDFGAVP/F 
LTELWQSITPHQSQQSQPV\ELDPFGAAPFPSKQ 


6150 


372 


37 


MSNtKKYIIDYDWKAStfitEiDHDVMTEBKiHQINNFWSDSEYR 
LN KHG SVLNAVL I MLAQ HALL I AI S SDLNA YG WCE FD WN DGNG 
QEGWPPMDGSEGIRITDIDTSGIF 


6151 


1555 


521 


DSNQQSVSGTAASTLLHSFKATIYYQGTGHVQQFYGVTSPYSQT 
TPPIVC^YAQPSLQYIG^CXJIFTAHPQGVVVQPAAAVTl'I VAPG 
QPQPLQPSBMWTNNLLDLPPPSPPKPKTIVLPPNWKTARDPEG 
KI YYYHVITRQTQWDP PTWES PGDDAS LEHEAEMDLGTPT YDEN 
PMK\ASKKPKTAEADTSSEIJWCKSKEVFRKEMSQFIVQCLNPYR 
KPDCKVG \ R ITTTEDFlGiliARKLTHGVMNKELKYCKNPE \ DLEC 
NENVKHKTKEYIKKYMQKF<3AVYKPKEDTEFRVTVGPGWEDGWS 
GKTDSRERKSCGPFCSTPVSTVLLMIHHPGBFNPADVN 


6152 


1366 


648 


NRTWSTPSTWMGVALPPLCSTGPWPVTRQITARTTCGAVPAKCP 
PWC/DVHEPRCQPPDCHGHGTCVDGHCQCTGHFWRGPGCDELDC 
GPSNCSQHGLCTKTGCRCDAGWTGSNCSEECPLGWHGPGCQRPC 
KCEHHCPCDPKTGNCSVSRVKQCLQPPEATLRAGELSFFTRTAW 
LALTLAIJIFLLLISTAA^SI^LSRAERNRRLHGDYAYHPI^EM 
NGEPLAAEKEQPGGAHNPFKD 


6153 


2 


3368 


GRVGARSPGRAYALLIiLIilCFNVGSGIiHLQVLSTRNENKLIiPKH 
PHLVRQKRAWITAPVALLEGEDLSKKNPIAKIHSDLAEERGLKI 
TYKYTGKGITEPPFGIFVFNKDTGELNVTSILDREETPFFL.LTG 
YALDARGNNVEKPliELRIKVLDrNDNEPVFTQDVFVGSVEELSA 
AHTLVMKINATDADEPNTLNSKISYRIVSLBPAYPPVFYLNKDT 
GEIYTTSVTIiDREEHSSYTLTVEARDGNGEVTDKPVKQAQVQlR 
ILDVNDNI PWENKVLEGM VE BNQVNVE VTRIKVFDADE I GSDN 
WLANFTFASGNEGGYFHIETDAOTNEGIVTLIKEVDYEEMKNLD 
FSVIVANKAAFHK3IRSKYKPTPIPIKVKVKNVKEGIHFKSSVI 
S I YVS ESMDRSS KGQ 1 1 GNFOAFDEDTGLPAHAR YVKLEDRDNW 
1 3VD S VTSE I KLAKLPDFESR YVQNGTYTVKI VAI S EDYPRKT I 
TGTVLINVEDINDNCPTLIEPVQTICHDAEYVNVTAEDLDGHPW 
SGPFSFSVIDKPPGMAPKWV T APOPGTC\rr.7iTncT?irxrr mocTrt 

FL ISDNQGFSCPBKQVLTLTVCEVLHGS \ GCREAQHDS YVGLGP 
AAIALMILAFLLLLLVPLLLLMCTCGKGAKGFTPIPGTIEMLHP 
WNNEGAPPEDKWPSFLPVDQGGSLVGRNGVGGMAKEATMKGSS 
SASrVRGQHEMSEMDGRWEEHRSLLSGRATQFTGATGAI \MTTE 
TT ITARATGASRDVAGAOJ^AVALNEEFLKN Y FTDKAAS YTEED 
ENHTAKDCIjLVYSQEETESLNAS IGCCSFIBGELDDRFLDDLGL 
KFKTIAE VCLGQKI D 1 NKE I EQRQKPATE TS MNTASHSLCEQTM 
VKSENTYSS GSSFPVPKSLQEANAEKVTQE I VTERSVSSRQAQK 
VATPLPDPMASRNV1ATETSYVTGSTMPPTTVILGPSQPQSLIV 
TER\Hf APASTLVDQP YANEGTVVVTERVI QPHGGGSNPLEGTQH 
LQDVP YVMVRER3S FLAPS SGVQPTLAMPNIAVGQNVTVTERVL 
APASTLQSSYQIPTENSMTARNTTVSGAGVPGPLPDFGLESSGH 
SNSTI TTSSTRVTKHSTVQHS YS 


6154 


3660 


2146 


KKKTKMKNTLQKTVNFGAWPKPT:iSDKSHLLQMVSKLI)LTDAKN 
SDTAH X KS IE I TS I LNG LQ ASES S AEDSEQED ERGAQDMDNNGK 
EESKIDHLTNNRNDLISKEBQNSSSLLEENKVHADLVISKPVSK 
S P ERLRKD I EVLS EDTDYEEDEVTKKRKDVKKDTTDKS S K PQ I K 
RG KRR YCNTBECLKTGS PGKKEEKAXNKESLCMENSSNSSSDED 
EEETKAKMTPTKKYNGLEEKRKSLRTTGFYSGFSKVABKRIKLL 
NNSDERLQNSRAKDRKDVWSS IQGQWP KKTLKELFSDSDTEAAA 
SPPHPAPEBGVAEESLQTVAEEBSCSPSVELEKPPPVNVDSKPI 
EBKTVE VNDRKAEFPSS GSNFSA* I PLPYLHLNRLHQSL* QKGS 
RQQS S VTVSEPLAFNQEEVRS I KS ETDST IE VD S VAGE LQDLQS 
ERE*LASRF*CQCELEQ* * SARTRTS* KSLYRSEKSERCSGRRK 
FX KKAE KKP * SNSGKQQKEG K 


6155 


869 


121 


HLLPEIiR3KSWITMKYVFYLGVIAGTFFFADSSVQKEDPAPYE^~ 
YLKSHFNPCVGVLI KP S WVLAPAHCYL PNLKVMLGNFKS RVRDG 
TEQT IN PIQ I VR YWNYS HS A PQD DLM L I KLAKPAMLNP KVQ ALN 
P\PTTNVRPGTVCLLSGLDWSQENSGRHPDLRQNLEAPVMSDRE 
CQKTEQGKSHRNS LCVKFVKVFSR I FGEVAVATVI CKDKLQGIE 
VGHFMGGDVG I YTNVYKYVS W I ENTAKDK 


6156 


5725 


3984 


GTST VTMATKKHFS 1 1 LNLLGML L KKDNQDTR KLLMTWALEVAV 
VMKKSETYAPLFCLPSFHKFCKGLLADTLVBDVNICLQACSSLH 
ALSSSLPDDLLQRCVDVCRVQLVHRGTCIRQAFGKLLKSIPLGV 
FLSNNNHTEIQEISLALRSHMSKAPSNTFHPQDFSD/VISFILY 
GNSHRTGKDKWLERLFYSCQRLDKRDQSTIPRNLLKTDAVLWQW 
AIWEAAQFTVLSKLRTPLGRAQDTFQTIEGIIRSLAGHTLHPDQ 
DVSQWTTADNDEGHGNNQLRLVLLLQYIiENLEKLMYNAYEGCAN 
ALTSPPKVIRTFLYTNRQTCQDWLTRIRLSIMRVGLLAGQPAVT 
VRHG FPLLTEMKTTS LSQGNELEVS IMMWEALCELHCPEAIQG 
IAVWSSS IVGKHLLWINS VAQQAEGRFEKASVEYQEHLCAMTGV 
DCCISSFDKSVLTLASAGCKSASLKHCIiNGESRXSVLSKPTDSS 
PEVINYLGNKACECYISTADWAAVQEWQNAIHDLKKSTSSTSIiN 
LKAD FNY X KSLS S FESGKFVECTEQLELLPGENINLLAGGS KEK 
IDMKKLLRNM 


6157 


946 


329 


MANRGPSYGLSREVQEKIEQKYDADLENKLVDWI ILQCAEDIEH 
PP PGRAHFQKWLMDGTVLCICi INSLYP PGQEP I P KI SESKMAFK 
QMEQ I SQFLKAAETYGVRTTD I FQTVDLWBGKDMAAVQRTLMAL 
GSVAVTKDDGCYRGEPSWFHRKAQQNRRGFSEEQLRQGQNVIGL 
QMGSNKGASQAGMTG YGM PRQ I M* DAAS CP 


6158 


441 


1482 


LGSLIVLSLHCKVrFSSQSLERAMKEKAVDLVPILAQNPGLAQN 
P ILEG KDHNQNTG VD P 1 1 DHVQDRKTD / S RSKS PHKKRS KS RER 

DITCD BPCWOPriVPVIVT'OPtf Tlf1?ITDPTfyt'gr\DPVPP on ntrDnntm 

KKoKoK^n^KiJAKJUJ lKbxvl KbKKKVKJLKUHEKoREREKEREKE 

KERGKNKDRDKBREKDRBKDKEKDREREREKEHEKDRDKEKEKE 
QDKE KEREKDRS KE I DEKRKKDKKSRTP PRSYNASRR5RSS SRE 
RRRRRSRSS SRSPRTSKT3KRKSSRSPS PRSRNKKDKKREKERD 
HXSERRERERSTSMRKSSNDRDGKEKLEKNSTSLKEKEHNKEPD 
S S VS KEVDDKDAPRTE EN KI QHNGN CQ LNE ENLS TKTEAV 


6159 


53 


84 


AVIAPLHISLGDRARPYLKNTEKSSTTCSRRRNQSFPPVMSLTH 
RLHLCKYWGCAVSNVCRFWEGRPLPLMIWPYTLPVSLPVGSCV 
IITGTPILTFVKDPQLEVNFYTGMDBDSDIAFQFRLHFGHPAIM 
NSCVFGIWRYBEKCYYLPFEIX3KPFELCIYVRHKEYKVMVNGQR 
I YNFAHRF PPAS VKMLQVFRD I SLTRVLX SD* GRCVRITAVQEF 
DVS VS CDCTTAYQPG 


6160 


1626 


1790 


AGAKFFP* F * KVADAQPTES EKE I YNQVNWLKDAEG I LEDLQS 
YRGAGHEIREAIQHPADEKLQEKAWGAWPLVGKLKKFYEFSQR 
LEAALRGLLQALTSTPYSPTQHLEREQALAlCQFABIIiHFTLRPD 
ELKMTNPAIQNDFS YYRRTLSRMR INNVPAEGBNEVNNELANRM 
S LF YAEATPMLKTL S DATTKFVS ENKNLP IENTTDCLS TMASVC 
RVMLETPEYRSRFTNEETVSFCLRVMVGVI ILYDEVHPVGAFAK 
TSKIDMKGCIKVLKDQPPNSVEGLLNAIJlYTTKHLNDErTSKQI 
KS MLQ * QLLTLVNKG 


6161 




1569 


PVSGSES SLRRAWAS I LRLMLGPRVAVSI LCEDGISH* XiLEKH* 
KSHVLEPI^SLAI^EQCLALSLDWSTCKTGRAGDQPLKIISSDS 
TGQLHLLMVNETRPRLQKVAS W I VYS 
GGDD GLLRG WDTRV PG KFLFTS KRHTMGVCS I QSS PHREHILAT 
GS YDEHILLWDTRNMKQPLADTPVQGGVWRI KWHPFHHHLLLAA 
CMHSGFKILNCQKAMEERQEATVLTSHTLPDSLVYGADWSWIjLF 
RS LQRAPS WS FP SNLGTKTADLKGAS ELPTP CHECRBBNDGEG H 
ARPQSGMKPLTEGMRKNGTWLQATAATTRDCGVNPEEADSAFSL 
LATCSFYDHALHLWEWEGN 


6162 


1 


586 


RT XHATGRAG AS PMHRL I VW RLAEANKQHVRCQKCLE FGH WTYE 
CTGKRKYLHRPSRTAELKKALKEKENRLLLOQS IGETNVERKAK 
KKRSKSVTSSSSSSSDSSASDSSSESEETSTSSSSEDSDTDESS 
SSSSSSASSTTSSSSSDSDSDSSSSSKQ*HQHR*QL*R*TTKEE 
EKEIELLHSYWTDGLKTLM 


6163 


1081 


•78* 


RIRSTTEGCAVRLHPTQNTGKAR1MILLSVSLGRHWAFTYKFFL 
TPVV^FFPFPFHRKS*VMQKNPMKSREDEWMEKLNNLHVQRAD 
MNRLIMNYLVTEGFKEAAEKFRMESGIEPSVDLETLDERIKIRE 
MILKGQIQEAI ALINS LHP ELLDTNR YL YFHLQQQHLIE L I RQR 
ETEAALEFAQTQLAEQGEESRECLTEMERTLALLAFDSPEES PF 
GDIiLHTMQRQKWSEWQAVIiDYENRESTPKLAKLLKLLLWAQN 
ELDQKKVKYPKMTDLSKGVI EEPK 


6164 


• 


406 



PCQS PGRS RMRQDKLTGSLRRGGRCLKRGGGGVGT ILSNVLKKR 
SCIS RTAPRJ^CTLEPGVDTKLKFTLEPSLGQNGFQQWYDAIjKA 
VARLSTGIPKEWRRKVWLTLADHYLHS IAIDWDKTMRFTFNERS 
NPDDDSMGIQIVKDLHRTGCSSYCGQEAEQDRVVLKRVLLAYAR 
WNKTVGYCQGFNIIiAALILEVMEGNEGDALKIMIYLIDKVLPES 
YFVNNLRALSVDMAV FRDLLRM KLPELSQHLDTLQRTANKES GG 
GYEPPLTNVFTMQWFLTLFATCLPNQTVLKIWDSVFFEGSE I IL 
RVSLAI WAKLGE Q I E CCETADE FYSTMGRLTQEMLEMDLLQSHE 
LMQTVYSMAPFPFP QLAELRE KYTYNI TPPPATVKPTS VSGRHS 
KARDSDEENDPDDEDAVVNAVGCLGPFSGFLAPELQKYQKQIKE 
PNEEQSLRSNNIAELSPGAINSCRSEYHAAFNSMMMERMTTDIN 
ALECRQYSRIKKKQQQQVHQVYIRADKGPVTSILPSQVNSSPVIN 
HIiLI^KKMKMI79RAAKNAVIH IPGHTGGKIS PVP YEOLKTKLNS 
P WRTH I R VHKKNM PRTKSH PG CGDTVG L I DE QNEAS KTNGLG AA 
EAFP5GCTATAGREGS S PEGS TRRTIEGQS PEPVFGDADVDVSA 
VQAKLGALELNQRDAAAETELRVHPPCQRHCPEPPSAPBENKAT 
SKAPQGSNSKTPI FS PFPSVKPLRKSATARNLGLYGPTERTPTV 
HTPOMSRSFSKPGGGNSGP*KMVFSSGTMI^ROLPGYPOF^nPKr 
GGERFG 


6165 


90 


405 


PCQSPGRSRMRQDKLTGSLRRGGRCLKRQGGGVGTILSNVLKKR 
SCISRTAPRLLCTLEPGWTKLKFTLEPSLGQNGFQQWYDALKA 
VARLSTGIPKSWRRKVWLTLADHYLHS1AIDWDKTMRFTFNERS 
NPDDDSMGIQ I VKDLHRTGCSS YCGQEAEQDRVVLKRVLLAYAR 
WNKTVGYCOGF^I^AALILEVMEGNEGDALKIMIYLIDKVLPES 
YFVNNLRALSVDMAVFRDLI^KLPELSQHIJDTLQRTANKESGG 
GYE P PLTNVFTMQW FLTLFATCLPNQTVLKI WDS VF PEGS EX IL 
R VS LAI W AKLG EQI E CC ETAD BFYS TMGRLTQEMLENDLLQS HE 
LMQTVYSMAPFPFPQLAELREKYTYNITPFPATVKPTSVSGRHS 
KARDS DEBNDP DDEDA WNAVGCLG P FSGFLAPELQKYQKQI KB 
PNEEQSLRSNNIAEIiSPGAINSCRSEYHAAFNSMMMBRMTI'DIN 
ALKRQYSRIKKKQQQQVHQVYIRADKGPVTSILPSQVNSSPVIN 
HLLLGKKMKMTMRAAKNAVIHIPGHTGGKISPVPYEDLKTKliNS 
PWRTH 1 RVHKKNMPRTXSHPGCGDTVGLIDBQNEASKTNGIjGAA 
EAFPSGCTATAGREGSSPEGSTRRTIEGQSPEPVFGDADVDVSA 
VQAKLGALELNQRDAAAETELRVHPPCQRHCPE P PS APEENKAT 
SKAPQGSNSKTP IPS PFPS VKPLRKSATARNLGIiYGPTERTPTV 
HFPQMSRSFSKPGGGNSGP * KMVFSSGTWLSRQLPGYPQE YQRN 
GGERFG 


6166 


2 


1206 


HKLWRTVAMAGAEWKSLEEtLEKHLPt&DLQEVKRVLYGKBLRK 
LDLPREAFEAASREDFELQGYAFBAAEEQLRRPRIVHVGLVQNR 
I PLPANAPVAEQVSALHRRI KAI V E VAAMCG VN 1 1 C FQ EAWTM P 
FAFCTREKLPWTE FAESABDGPTTRFCQKLAKNHDM WVS PILE 
RD S EHGD VLWNTA W I SNS GAVLG KTRKNHI PRVGDFNESTYYM 
EGJTIX3HPVF0TQFGRIAVNICYGRHHPLNWLMYSINGAEIIFNP 
SAT I GAItS ESLW PI EARNAA IANHCFTCAINRVG TE HF PNEFTS 
GDG KXAHQDFG Y FYGSS YVAAPDS S RTPGLSRSRDG LL VAKLDL 
NLCQQVNDVWNFKMTGRYEMYARBIAEAVKSNYSPTIVKE*PAS 
VPALG 


6167 


1220 


1844 


YGIVTGPSLCAGDKQPKKQEKNPVLVSPEFVDEAIiCACEE'YLSN 
IAKMDIDKDLEAPLYLTPEGWSLFLQRYYQVVHEGAELRHIjDTQ 

VQRCEDILQQLQAWPQIDMEGDRNIWIVKPGAKSRGRGIMCMD 
HLEE^U^VNGNPVVMKDGKS^VVQKYIBRPLLIFGTKFDLRQWF 
LVTDWNPIITVWFYRDSYIRFSTQPFSLKNLDK+APLYLTPEGWS 
LFLQRYYQWHEGAELRHLDTQVQRCEDILQQLQAWPQIDMEG 
DRNIWIVKPGAKSRGRGIMCMDHLEEMLKLVNGNPWMKDGKWV 
VQKYIERPLLIPGTKFDLRQWFLVTDWNPLTVWFYRDSYIRFST 

QPFSLKNIiDK 


6168 


84 


1392 


VWPVPSVSAMPPKKQAQAGGSKKAEQKKKEKIIEDKTFGLKNKK 
GAKQQKFI KAVTHQVKFGQQNPRQVAQSEAEKKLKKDDKKKE LQ 
ELNELFKPVVAAQfCISKGADPKSVVCAFFKQGQCTKGDKCKFSH 
DLTLERJlCEKRSVYIOARDEELEKDTMDNWDEKXLEEVVNKXHG 
EAEKKKPKTQI VCKHFLEAI ENNKYGWFWVCPGGGD I CMYRHAL 
PPGFVLKKKKKKKKKBDEISL*DLIERERSALGPNVTKITLESF 
LAWKKRKRQEKIDKLEQDMERRKADFKAGKALVISGREVFEFRP 
ELVNDDDEEADDTRYTQGTGGDEVDDSVSVNDIDLS L YIPRDVD 
ETGITVASLBRFSTYTSDKDBNKLSEASGGRAENGERSDLEEDN 
EREGTENGAIDAVPVDENLFTGEDLDELEEBLNTLDLEE 


6169 


112 


662 


APAAAMAERPEDLNLPNAVITRI IKEALPDGVN2 S KEARSAI SR 
AAS VFVLYATS CANNFAMKGKRKTLNAS D VLSAM EEME FQRFVT 
PLKEALEAYRREQKGKKEASEQKKKDKDKKTDSEEQDKSRDEDN 
DEDEBRLEEEEQNEEEEVDN* KGRETVAPWKVPLEMRRATCFCE 
AFPCWAE 


6170 


62 


667 


STKVMLPNTGRJ^GCTVFITGASRGIGKAIALKAAKDGANIVIA 
AKTAQPHP KLLGT I YTAAEE I EAVGGKALPC I VDVRDEQQI SAA 
VEKAIKKFGGIDILVlWASAISLTtmiDTPTKRLDLMMNVNTRG 
TYLASKACIPYLKXSKVAHIPNISPPLNLNPVWFKQHCGRW*W 
G * GDGLCLI CFELNIjCMSDVITI CT 


6171 


382 


941 


HFMQSDVEIiDCDIEPCGHTKFPPTLPLSTTVIVCSCHPVATAST 
MAEAFSKTTSEEDQS IQEPKEANSMTAQKQKK*GLRGSRRRHAN 
SGGDI FGDS FAAYF PRVLKQVHQALSLSQEAVSVMDSMVRDILD 
R I ATEAGHLAH YS KCVTI TSRD IRMAVCLLLPGXMG JCLAES QGT 
NATLRYTKSK 


6172 


651 


54 


GLCRAGGAHRFSRTHVEAALKMLRREARLRREYLYRKAREEAQR 
SAQERKERLRRAIiEENRLI PTELRREAIiALQGSLEFDDAGGEGV 
TSHVDDEYRWAGVEDPKVMITTSRDPSSRLKMFAKELKLVFPGA 
QRMNRGRHBVGAL\niACKANG VTDIiLVVHEHRGTP VGL I VS HI* P 
FGPTAYFTLCNVVMRHDI PDLGTMSEAXPHLITHGFSSRLGKRV 
SD I LRYL F P VPKDDS HRV ITFANQDDYIS FRHHV YKKTDHRNVE 
LTEVG PRFELKLYMI RLGTLEQEATADVEWRWHPYTNTARKRVF 
LSTE+AAPRPLGQLL 


6173 


3 


288 


SVDHREVQVLSQSMPLTPHQAVLRGBRPYMCVECGKCFGRSSHL 
LQHQRI HTGEKPYVCS VCGKAFSQS SVLS KHRTIHTGE KP YE CN 
ECGKAFRVSSDLAQHHKIHTGEKPHECLECRKAFTQLSKLIQHQ 
RIHTGERPYVCPLCG KAFNHST VLRSHQRVHTGEKPHRCNECGK 
TFSVKRTLLQHQRIHTGBKPYTCSSOGKAFSDRSVLIQHHNVHT 
GEKPYECSECGKTFSHRSTLMNHERIHTEEKPYACYECGKAFVQ 
HSHI»IQHQKVHRKL*PTCVLSVGSA1AGVPTSFSISVSTLERSP 
MCAVYVGR PS ARAQS LVNTGQFTQVRS PMS VM SVEKPLE 


6174 


1060 


959 


PRPPGKRWMVAGIiGNPGLPGTRHSVGMAVLGQLARRLGVAESWT 
RDRHCAADLAliAPLGD AQLVLLRPRRLFWANGRS VARAAE LFGL 
TAEEVYLVHDELDKPLGRLALKLGG3ARGHNGVRSCISCLNSNA 
MPRLR VG IGRPAHPBAVQAHVLG CFSPAEQ EL LPLLLDRATDL, I 
LDHIRERSQGPSLGP*H*WFSKKA 


6175 


2204 


334 


RYFRADPRSRSGQPRAEGI/3AFAEGPLRAMAAPVKGNRKQSTEG 
DALDPPASPKPAGKQNGIQNPISIjEDSPEAGGEREEEQEREEEQ 
AFLVSLYKFMKERHTP IERVPHLGFKQINLWKI YKAVEKIiGAYE 
LVTGRRLWKNVYNELGGS pgstsgatctrrhy* rlvlpyvrhlk 
GEDDKP LPTS KPRKQYKMAKENRGDDGATERP KKAKEERRMDQM 
M PGICTKADAAD PAPLPSQEPPRNS TEQQGLASGS S VS FVGASG C 
PEAYKRLLSSFYCKGTHGI^PIAKKKLLAQVSKVEALQCQEEG 
CRHGAE PQAS PAVHLPES PQSPKGLTENSRHRLT PQEGIjQAPGG 
5LREEAQ AG P C PAAP I FKGCFYTHPTEVLKPVSQHPRDFFSRLK 
DGVLLGPPGKEGLS VKBPQLVWGGDANRPSAFHKGGSRKG I LYP 
KPKACWVSPMAKVPAESPTLPPTFPSSPGLGSKRSLEEEGAAHS 
GKRIJlAVSPFLKEADAKKCGAKPAGSGLVSCIiLG PALGPVP PEA 
YRGTMLHCPLNFTGTPG PIiKGQAALP FSPIiVI PAFPAHFLATAG 

PSPMAAGL^a^FPPTSroSALRHRLCPASSAWHAPPVTTYAAPHF 
FHLNTKL 


6176 


1040 


402 


PLSALRAMAEVHVIGQIIGASGFSESSLFCKWGIHTGAAWKLLS 
GVREGQTQVDTPQIGDMAYWSHPIDLHFATKGIiQGWPRLHFQVW 
SQDSFGRCQIiAGYGFCHVPSS PGTHQliACPTWRPLGSWREQIAR 
AFVGGGPQLLHGDTIYSGADRYRLHTAAGGTVHLEIGLLLRNFD 
RYGVEC*GTLPPTS PPSTPRTPSDGGGWHSGQEHRL 


6177 


1400 


992 


VPIESIiVGKVHNFPLIAFYCCEKGKRQPHKSLHDRCFGEALDPN 
CSHCYLDQIKRSDFLGFSGYSPHFVAISTMSBHKMQPSSMQQAL 
PSQ*PYWTDPRPALVPCCSHRPDVHRSRPGPGLPGTSGCSDRPP 
VCPI 


6178 


1027 


254 


STQRGGIKGVARAASLVGRRRAGTGMALLLCLVCLTAALAHGClT" 

HCHSNFSKKFSFYRHHVNFKSWWVGDIPVSGALLTDWSDDTMKE 

LHLAIPAKITREiCLDQVATAVYQMMDQLYQGKMYFPGYFPWELR 
NT FREOVMT . T OMH TTF QT? T npnuo rviT sn ve»tt e f*mxr*m \ r> t r» * «» 
a x e ivct v w niixu'"** J. ASMUl/V-UilKU» 1 r y Ic 1 XauiNCTDSHVA 

CFGXNCESS AQWKS AVQGLLN Y IKNWHKQDTSMR PRS SAFS WPG 

THRAAPAFLVLPALRCLEPPHLANIiSLEDAA* CLKQH 


6179 


806 


276 


RGETREMAGNIiLSGAGRRLWDWVPLACRSFSLGVPRLIGIRLTL 
PPPKWDRWNEKRAMFGVYDNIGILGNFEKHPKELIRGPIWliRG 
W KGNE LQRC IPJCRKMVGSRMFADDLHNLNKRIR YLYKHFNRHG K 
FR*KRKLRTSEKAHLSPWRRETVLFPVRKRLCIFSVIKWGFFGI 


6180 


156 


1833 


DHHILKAASTTHVCARGN1 FAI PNTRCLEC* ATATPSSLECQN * 
SHLSLCPLPATTSGLTPNSMI PEKERQNIAERLLRVMCADLGAL 
SWSGKEFLKIiAQTLVDSGARYGAFSVTBILGNFOTIiAIjKHLPR 
MYNQVKVKVTCALGSNACLG I GVTCHSQ S VGPDSCYI LTA YQAE 
GNHIKSYVLGVKGADIRDSGDLVHHWVQNVLSBFVMSEIKTVYV
TDCRVSTSAFSKAGMCLRCSACALNSVVQSVLSKRTLQARSMHE 
VI ELLNVC EDLAGS TG LAKET FGS LEETS P P PCWNS VTDS LLLV 
HERYEQICEFYSRAKKMNLIQSLNKHLIiSNLAAILTPVKQAVIE 
LSNESQPTLQLVLPTYVRLEKLFTAKANDAGTVSKLCHLFLKAL 
KENPKVHPAHKVAM ILDPQQKLRP VPPYQHEEI IG KVCELINE V 
KESWAEBADFEPAAKKPRSAAVENPAAQEDDRIiGKNEVYDYLQE 
PLFQATPD LFQY WS CVTQKHTKLAKLAFWLLAV P AVGARSGCVN 
M CEQALL I KRRRLLS PEDMNKLMFLKSNML 


6181 


169 


1032 


TRTLLSPVLLPGPRWKPWRRRPMdPIALPAWLQPRYRKNAYLFI 
YYLIQFCGHSWI FTNMTVRFFSFGKDSMVDTFYAIGLVMRLCQS 
VSIaLELLHIYVGIESNHLLPRFLQLTERIIILFWITSQEEVQE 
KYWCVLFVFWNLLDMVR YT YS MLS V I G I S YAVLTWLSQTIjWM P 
I YPLCVIAEAFAIYQSLP YFESFGTYSTKLPFDIiS I YFPYVLKI 
YLMMLFI GM YFTYSHL YS ERRD I LG I F P I KKKKM*STAFQCDTR 
KDRLWIQCSK*NTGSILVEKFLVF 


6182


1769 


1224 


AS* IDYQLNTLLKEFQLT^E^TKLRYLTCSLIEDMAAAYFPDCI 
VRPFGSSVNTFGKLGCDLDMFLDLDETRNLSAHKISGNFLMEFQ 
VKNVPSERIATQKILSVLGECLDHFGPGKrVGVQKILHARCPLVR 
FSHQASGFQCDLTTNNRIALTSSELLYIYGALDSRVRALVFSVR 
CWARAHSLTS3IP0AWITNFSLTMMVIFFLQRRSPPILPTLDSL 
KTLADABDKCVI EGNNCTFVRDLSRI KP SQNTETLELL LKEFFE 
YFGNFAFDKNS INIRQGREQNKPDS S PLY I QNP FETSLN I S KNV 
SQSQLQKFVDLARES AW I LQQEDTDRPS ISSNR PWGLVS LLLPS 
APNRKSFTKKKSNKFAIETVKNLLESLKGNRTENFTKTSGKRTI 
STQT 


6183 


1118 


452 


HLDRYIKSPGSGSSTPAPPSHLLLYLLHPQSTRTMGCCGCSRGC 
GSGCGGCGSSCGQCGSGCGGCGSGRGGCGSGCGGCSSSCGGCGS 
RCYVPVCCCKP VCS WVPACSCTS CGSCGGSKGGCGSCGG5 KGGC 
GSCGCSQSS CCKPCCCSSGCGSS CCQSSCCKPCCCQSSCCVPVC 
CQSSCCKPCCCQSNCCVPVCCQCKI * G SG PRPS GFS CLVKAFLM 
VP 


6184 


1 


*191 


IVTVREEDGAPAVAPPGVWSRANKRSGAGPGGSGGGGARGAEE 
EPPPPLQAVLVADSFDRRFFPISKDQPRVLLPLANVALIDYTLE 
FLTATGVQETF VFCCWKAAQ I KEHLLfCS KWCRPTS LNWR 1 1 TS 
ELYRS LGDVLRDVDAKAL VRSDFLLVYGD VISNirXITRALEEHR 
LRRKL* KNVSVMTMI FKES S PSHPTRCHEDNVWAVDSTTNRVL 
HFQKTQGLRRFAFPLSLFQGSSDGVEVRYDLLDCHISICSPQVA 
" QLFTDNFDYQTRDDFVRGLL VNEE I LGNQIHMHVTAKE YGARVS 
NLHMYSAVCADVI RRWVYPLTPEANFTDSTTQSCTHSRHNI YRG 
PEVSLGHGSILEENVLLGSGTVIGSNCFITNSVIGPGCHIEPGD 
NVVI#I>QTYLWQGVRVAAGAQIHQSI»LCDNAEVKERVTLKPRSVL 
rSQWVGPNITLPEGSVISLHPPDAEEDEDDGBFSDDSGADQEK 
DKVKMKG YNPA EVGAAGKG YLWKAAGMNMEEEE ELQQNL WGLKJ 
NMEEESESESEQSMDSEEPDSRGGSPQMDDIKVFQNEVLGTLQR 

MDSPLDSSRYCALLLPLLKAWS P VFRN YIKRAADHLEALAAI ED 
FFLEHEAI*GISMAKVLMAFYQLEILAEETILSWFSQRDTTDKGQ 
QLRKNQQLQRFIQWLKEAEEESSEDD


6185


791


44 


PCTS CVLWATLHLPASTRKAPQAECGM IS ITEWQKIGVGITG FG 
IFFILFGTLLYFDSVLLAFGNLLFLTGLSLIIGLRKTFWFFFQR 
HKLKGTSFLIiGGVVIVLLRWPLLGMFLETYGFFSLFKGFFPVAF 
GFLGNVCNI PFLGALFRRLQGTSSMV* KTEMSSLNLDHWLKGAK 
REEWEPPPQSPALTHSPTYPGPPQVQKERMGAEQLTSNPQVDSR 
GCQEAEMQTPRRLGWGWYHTLTLYLWEEK 


6186 


569 


238 


VYGIDSSNTNTHGAEERNRKLKKHWKLCHAQSRLDVNGLALKMA 
KERKVKNKVKNKADTEEVFNKSPTNQEKMPTSAILPD?SGSVIS 
NIRNQMETLHSQPHQEENLCFENSFSL1NLLPINAVEPTSSQQI 
PNRETSEANKERRKMTSKSSESNIYSPLTSFITADSELHDIIKD 
LEDCLMVGK4TCX3DLAPNTLRIPTSNSEIKGVCSVGCCYHLLSE 
EFENQHKERTQEKWGFPMCHYLKEERWCCGRNARMSACLALERV 
AAGQGL PTESIiFYRAVLQDl IKDCYGITKCDRHVGKI YS KCSS F 
LDYVRRSLKKLGLDESKLPBKI IMNYYEKYKPRMKELEAFNMLK 
VVLAPC IETLI LLDRLCYLKEQED I AWSAtiVKLPD PVXSPRC YA 
VIALKKQQ*FPLKQI IRCISL*DSAGCAEEVSVGDGGPAI*RDAP 
PSGSRVGSRYD 


6187 


1701 


771 


DAWGPETRLARILNPDSFIEPRPGRLPEiEATRPHMEPKASCPA
AAPLMERKFHVLVGVTGS VAALKLPLLVSKLLDI PGLEVAWTT 
ERAKHFYS PQDIPVTLYSDADEWEMWKSRSDPVLHIDLRRWADL 
I»L VAPLDANTLGKVAS GI CDNLLTCVMRAV7DRS KPtiLFCPAMNT 
AMWEHP I TAQQVD QLKA FQYVE I P CVAKKL VCGDEGLGAMAEVG 

TlVDKVKEVLFQHSGFQQS*PGISV>5GVPIiYSEWVQAICSVl(MDV 
GKIGGYPHLLNGGPA^*SLPRG0A(^RIlNWTRRPRTJ5PRnD^ , T?a^ 
A 


6188 


238 


1534 


KGFV>TAGPU4AELQVSPQWKAPEMSQICLSCGHPSA*GPRWASW 
NIGVFICIRCAG1HRITLGVHISRVKSVNLDQWTQEQIQCT1QEMG 
NGKAhnUiYEAYLPETFRRPQIDPAVEGFIRDKYEKKKYMDRSLD 
INAFRKEKDDKWKRGSEPVPEKKIiEPVVFEKVXMPQKKEDPQLP 
RKSS PKSTAPVMDLLGIiDAPVACS IANSKTSNTLEKDLDLLASV 
PS PSSSGSRKWGSMPTAGSAGSVPSNLNLFPEPGSKSEE IGKK 
QLS KDS ILSLYGS QTPQMPTOAM FMAPAQMAYPTAYPS FPGVTP 
PNSIMGSMMPPP VGMVAQ PGASGMVAPMAMPAGYMGGMQASMMG 
VPNGMMTTQQAGYMAGMAA^^OT^^y^3VOP2l^VlI.nwK^ , ttimtwim 
AG MN FYGANGMMNYG Q S MS GGNEQAANQTL S PQMWK 


6189


1297 


793 


LGEPLGDLCELI PGDVQQLQMGEVHPGTGAQGSAAQSVAGEVQL 

TQLSHARQRPSC^SQLIALDLQHMDISRQPRWQHVQPVARQVQ 

RAQQAQLABGVAVHLWAGDAVVAEVELLQEVGGGKVFAANACDL 

WQDHEX3A^IAARQATGHAU}RVIVQVRRVQPLEAL*RVPSGLPR 

RVRAFMILHNQITGIGREDFATTYFLBELNLSYNRITS PQVHRD 

AFT*KLRI^SLDI,SGNRLHMLPPGIJPRNVHVLKVX^ 

GAIiAGMAQLRELYLTSNRLRSRALGPRAWVDLAHLQLLD I AGNQ 

LTEIPEGLPESLEYLYLQNNKISAVPANAFDSTPNIiKGIFXRFN 

KLAVG S WDSAFRRLKHLQ VLDIEGNLEFODI 3 KDRGRLGKEKE 

EEEBDEVEEEETR 


6190 


66 


1309 


ILVGNVSFLLS FAEYVCNCS WGSLNVNRCNQTTGQCE CRPGYQ 
GLHCETCKEGFYLNYTSGLCQPCDCS PHGALS I PCNSSGKCQCK 
VGVIGSICDRCQDGYYGFSKNGCLPCQCNNRSASCDALTCACLN 
CQENSKGNHCEECKEGFYQSPDATKECLRCPCSAVTSTGSCSIK 
SSELEPECDQCKDG YIGPNCNKCENGYYNFDS I CRKCQCHGHVY 
PVKTPKICKPESGECINCIiHNTTGFWCENCL^GYVHDLBGNCIK 
KVILPTPEGSTILVSNASLTTS VPTPVINSTFTPTTLQTI FS VS 
TSENSTS ALADVS WTQFNI I ILTVI I IVWLLMGFVGAVYM YRE 
YQNRKLNAPFWTI ELKEDNI 5 FSS YHDS I PNAD VSGLLEDDGNE 
VAPNGQLTLTTPIHNYKA 


6191 


1212 


1511 


VNLCHGGLLHLSTHHLGIKPSMH*LFFLMLSFPHLTPQQPKCPS 
MIDWIKKIWYIYTMEYYATIKRNBIMFFAGTWMEMEAIILSKLM 
QDYMFSLISGS 


6192 


3 


950 


TRGCGN KMAGKKNVLSSLAVYAEDSEPESDGEAG IEAVGS AAEE 
FCGGLVS DAYGEDD FS RLGGDEDGYEEEEDENSRQS EDDDS ETEK 
PBADDPKDNTEAEKRDPQBLVASFSBRVRNMSPDEIKIPPEPPG 
RCSNHLQDKIQKLYERKI KEGMDMNYI IQRKKEFRNPS I YEKLI 
QFCAIDELGTNYPKDMFDPHGWSEDSYYEALAKAQKIEMDKLEK 
AKKERTKIEFVTGTK1CGTTTNATSTTTTXASTAVADAQKRKSKW 
DSAI P VTT IAQ PT I LTTT AIL P AWTVTTS AS G S KTTVI S A VGT 
IVKKAKQ 


6193 


3 


950


TRGCGNKMAGKKNVLS S LAVYAEDS 5 PESDG EAG t EAVGS AAEB 
KGGLVSDAYGEDD FSRLGGDEDG YE3BEDENSRQSEDDDS 3TEK 
PEADDPKDNTEAEKRDPQELYASFSERVRKMS PDBIKIP PE PPG 
R CSNHLQDKI QKLYBRKIKEGWDMN YI IQRJCKEFRNPS I YEKL I 

AKKERTKIE FVTGTKKGTTTNATSTTTTTASTAVADAQKRKSKW 
DSAI ?VTT I AQPTILTTTATLPAVVTVTTSASGSKTTVI SAVGT 
IVKKAKQ 


6194 


3 


950 


a. v.*? r* rvrartv* AJSJM V Li o o iu\ v X All U a a tr n oDwilAGIEAVGSAAEE 

KGGLVSDAYGEDDFSRLGGDBDGYEEEEDENSRQSEDDDSETEK 
PEADDPKDNTEAEKRDPQELVASFSERVRNMS PDE IKI PPEPPG 
RCSNHLQDKI QKL YERKI KEGMDMNY 1 1 QR KKE FRNPS I YE KLI 
QPCAI DELGTNYPKDMFD PHGWS EDS Y YE ALAKAQ KI EMDKLEK 
AKKERTKI EFVTGTKKGTTTNATS TTTTTAS TA VADAQKRXS KW 
DSAI P VTTIAQPTILTTTATLPAVVTVTTS ASGSKTXVI SAVGT 
IVKKAKQ 


6195 


736 


235


"VANGLqSNMPKF YCDYCDT YLTHDS PS VRKTHCSGRKHKENVKD' " " 

yyqkwmeeqaqsl idkttaafqqgki p p tpfsapp pagam i pp p 
pslpgpprpgmmpaphmggppmmpmmgppppgmmpvgpapgmrp 
pmgghmpmmpgppmmrpparpmmvptrpgmtrpdr 


6196 


1512 


623 


KTGIO^AAYVRNILDNAEQVISNLEARNt^PRL^PL^EEDSH'" 

orllmglmvselkdhflrhlqgvekkkieqmvldyisklldi,ic 

** x v ** ■* ** " m\JiPt lmix anv una riasvj&j\HJL t AvcHI Ml KI LEATNS Lt 

flplppgfhtlhtilgvqciiplhnllhcidsgvllltetaviri. 
mkdldntekneklkfs 1 1 vrlppligqkicrlwdhpmssni isr 
nhvtrllqnykkqprnsminks sfs ve flplnyfi e i ltd i ess 
hqalypfeghdnvdaefveeaalkhtamllgl 


6197 


3 


819 


ADPEGTE3AVMSRYTRPPWTSLFIRNVADATRPEDLRREFGRYG 
P I VDVYI PLDFYTRRPRG FAYVQFE1)VRDAEDAL ynlnrkkvcg 
RQIE I Q FAQGDRKTPGQMKS KERHPCS PS DHRRSRS P SQRRTRS 
RSSSWGRNRRRSDSLKESRHRRFSYSQSKSRSKSLPRRSTSARQ 
SRTPRRNFGSRGRSRSKSLQKRSKSIGKSQSSSPQKQTSSGTKS 
RSHGRHSDS IARSPCKSPKGYTNFETKVQTAKHSHFRSHSRSRS 
YRHKNSW 


6198 


111 


1912 


SEAALSPSFISPACFLIiRKI.PAIjEDGTLPHPDTLGMNYEGARSE 
RENHAADDSEGGALDMCCSERI.PGLPQP IVMEALDBAEGLQDSQ 
REMPPPPPPSPPSDPAOKPPPRGAG«rH<;T/FVPQQr nr paacAPT 
I*ACGVLWFSGYGHI WSQNATNLVSS LLTLLKQLEPTAWLDSGTW 
GVPSLLLVFLSGGLVLVTTLVWHLLRTPPEPPTPLP PEDRRQSV 
SRQPSFTYSEWMEEKIEDDFLDLDPVPETPVFDCVMDIKPEADP 
TSuTVKSMGLQERRGSNVSLTLDMCTPGCNEEGFGYIjMSPREES 
ARE YLLS ASRVLQAEELHEKALDP FLLQAEF FE I PMNF VD PKE Y 
DIPGLVRKNRYKTI LPNPHSRVCLTSPDPDDPLSSY INANYTRG 
YGGEEKVYIATQGPIVSTVADFWRMVWQEHTPriVKITNIEEMN 
EKCTEYWPEEQVAYDGVEITVQKVIHTEDYRLRLISLKSGTEER 
GLKHYWFTSWPDQKTPDRAPPLLHLVREVEEAAQQEGPHCAPII 
VHCSAGIGRTGCF I ATS I CCQQLRQEG WD I L KTTCQLRQDRGG 
MIQHCEQYQFVHHVMSIiYEKQLSHQS PE 


6199 


144 


1211 


MARENGESSSSWKKQAEDIKKIFEFKETLGTGAFSEVVLAEEKA 
TGKLFAVKC I ? KKALKGKESS I ENE I AVLRKI XHENI VALED I Y 
ES PNHLYLVMQLVS GGELFDR I VE KGF YTEKDAS TLIRQVLDAV 
Y YLHRMG I VHRDLKP ENLLYYSQDEES KIMI S D FGLS KMEGKGD 
VMSTACGTPGYVAPBVLAQKPYSKAVDCWS IGVI AYTLLCGYPP 
FYDENDS KLFEQ I LKAEYE FDS P YWDDISDSAKDFI RNLMEKDP 
NKRYTCEQAARHPWtAGDTALWKNIHESVSAQIRKNFAKSKWRQ 
AFNATAVVRHmKLHI^SIJJSSNASVSSSLSIASQKDCASGTF 
HAL* 


6200 


702 


96 


L PEVPH S LRPR VKPHLCCAQ P AVR VMARLPKLAVFDIiDYTLWP F 
WVDTHVDPPFIIKSSDGTVRDRRGQDVRLYPEVPSVIjKRIjQSIiGV 
PGAAAS RTS BI EGANQ LLEIiFDLFR YF VHREI YPG SKI TH FERL 
0XJKTGIPFSQMIFFDDERRNIVDVSKIX5VTCIHIQNGMNLQTLS 
QGLETFAKAQTGPLRSSLEBS PFEA 


6201 


2809


2383 


GQTPRVRWKMRRSLRAGKRRQTAGRKSKSPPKVPIVIQDDSLPA 
GP P PQ IR I LKRP TSNGWSS PNS TSR PTLP VKSLAQREAEYAEA 
RKRILGSASPEEEQEKPILDRPTRISQPEDSRQPNNVIRQPLGP 
DGSQGFKQRR 


6202 


2 


426 


INADRAAVASSLLSRPTRKMAPQKDRKPKRSTWRFNLDLTHPVE 
DG I FDSGNFEQFIiREKVKVKGKTGNLGN WHX RRFKNKITWSE 
KQFSKRYLKYLTKKYLKKNNLRDWLRVVASDKETYSLRYFQISQ 
DKDBSESED * j 


6203 


419 


2550 


RCPR PPATAGAAASRPDRS P PS G I SGS EAAAGAGAAAPAS QHPA 
TGTGAVQTEAMKQ ILGVI DKKLRNLEKKKGKliDD YQERMN KGER 
LNQDQLDAVS KYQBVTNNLEFAKELQRSFMALSQDI QKTIKKTA 
RREQLMREEAEQKRLKTVLEH^YVIJDKLGDDEVRTDLKQGIjNGV 
P IhS EEELSLLDEFYKLVDFERDMSLRLNBQYEHAS IHLWDLLB 
GKEKPVCGTTYKVLKEIVERVPnSNYTO<3THT i IHnNnT.r , PirVBiv , R 

SAPAVEDQVPEAEPEPAEEYTEQSEVESTEYVNRQFMAETQFTS 
GEKEQVDEVTTVETVBWNSLQQQPQAASPSVPEPHSLTPVAQAD 
PLVRRQRVQDLMAQMQGPYNFIQDSMLDFENQTTiDPAIVSAQPM 
N PTQNMDMPQLVCPPVHS ESRLAQPNQ VPVQPEATQVPLVS ST5 
EGYTASQPIiYQPSHATEQRPQKEPI DQIQATI SLNTDQTTASSS 
LPAASQPQVFQAGTSKPLHSSG INVNAAPFQSMQTVFKMNAPVP 
PVNEPETLXQQNQYQAS YNQS FSSQPHQVEQTEI*QQEQLQTWG 
TYHGS PDQSHQVTGNHQQPPQQ^TGFPRSNQ P YYNSRGVSRGGS 
RGARGLMNGYRGPANGFRGGYDGYRPSFSNTPNSGYTQSQFSAP 
RDYS GYQRDG YQQNFKRGS G QS G P RGAPRGRGG PPRPNRGM PQ M 
NTQQVN 


6204 


2933 


787 


CTHNLISLLGGRAIiIHFNRFLNLKIQEGEAHNIFCPAYDCFQLV
PGDIIKSWSKEI4DKRYLQFDIKAFVENNPAIKWCPTPGCDRAV 
RLTKQGSNTSGSDTLSFPLLRAPAVDCGKGHLFCWECIjGEAHEP 
CDCQTWKNWLQKI TEMKPEELVGVS EAYEDAANCLWLLTNSXPC 
ANCKS P IQKNEGCNHMQCAKCKYDFCW I CLEEW KKHS FVHWE V I 
YRCTRYBVIQHVEEQSKEMTVEAEKKHKRFQELDRFMHYYTRFK 
NHEHSYQLEQRLLKTAKBKMEQLSRALKETEGGCPDTTFIEDAV 
HVLLKTRRILKCS YP YGFFLE P KSTKKE I FGLMQTDLEMVTEDL 
AQKVNRPYLRTPRHKI IKAACI>VQQKRQEFLASVARGVAPADSp 
EAPRRS FAGGTWDWBYLGFAS PEEYAE FQ YRRRHRQRRRGD VHS 
IiDSNPPDPDEPSESTLDIPEGGSSSRRPGTSWSSASMSVLHSS 
S IjRDYT PASRS E2JQDS IjQALSS LDEDD PN ILLAI QLSLQES GL& 
LDEETRDFLSNEASLGAIGTSLPSRLDSVPRNTDS PRAALSSSE 
LLELGDS LMRLGAEHD PFS TDTLSSKP I*S EARS DF C P S S S D PDS 
AGQDPN I NDNLLGNI MAWFHDMNPQS IAL I PPATTE I S ADS QLp 
CI KDGS EG VKDVELVLP EDSMFEDASVS EGRGTQ I EEWPLE EWI 
PGGGKQHPQAW 


6205 


1 


1200 


RAHRGKMALEVGDMEDGQLSDSDSDKTVAPSDRPLQLPKVLGGD
SAMRAFQNTATACAPVSHYRAVES VDS S EES FSDSDDDS CLWKR 
KRQKCFNP PPKPBPFQFGQSSQKPPVAGGKKINNI WGAVLQEQN 
QDAVATELGILGMEGTIDRSRQSETYNYLLAKKLRKESQEHTKD 
LDKELDEYMHGGKKMGSKBEBN3QGHLKRKRPVKDRLGNRPEMN 
YKGRYB I TAEDSQE KVADE I S PRLQB P KKDLIAR WR I 1 GNKKA 
I3LLMETAB VEQNGGLFIMNGSRRRTPGGVFLNLLKNTPS ISEE 
Q I KDI FYI ENQKE YENKKAARKRRTQVLGKKMKQAX KSLNFQBD 

DDTSRETPASDTNEALASLDESQBGHAEAKLEAEEAIEVDHSHD 
LDIF 


6206


1U 


1442 


IISERRERSOjHIiVCIRCSCDVVBMGSVLGLCSMASWIPCIjCGS "' 

apciicrccpsgnnstvtrliyaj^llvgvcvacvmlipgmeeq 
lnkipgfcenekgwpcnilvgykavyrlcfglampylllsllm 
ikvksssdpraavhngfwffkfaaaiaiiigaffipegtfttvw 
fyvgmagafcfiliqlvllidfahsknesnvekmeegnsrcwya 
allsatalnyllslvaivlffvyythpascs enkafis vnmiilc 
vgasvmsilpkiqesqprsgllq3svitvytmyltwsamtnepe 

TNCNPSLLSIIGYNTTSTVPKEGQSVQWWHAQGIIGLILFLLCV 
FYS91RTSNNSQVNKLTLTSDESTLIEDGGARSDGSLEDGDDVH 
RAVDNERDGVTYS YS FFHFMLFIASLY I MMTLTNW YRYBPS REM 
KSQWTAVWVKISS S W I GIVLYVWT1»VAPL»VLTNRDFD 


6207


2924 


1471 


7VMAEAATPGTTATTSGAGAAAATAAAASPTPIPTVTAPSLGAG "' 

GGGGGSDGSGGGWTKQVTCRYFMHGVCKEGDNCRYSHDLSDSPY 

SWCKYFQRGYCIYGDRCRYEHSXPUCOEEATATEt/TTKSSLAA 

SSSLSSIVGPLVEMNTGEAESRNSNFATVGAGSEDWVNAIEFVP 

GQPYCGRTAPSCTEAPLQGSVTKE2SEKEQTAVBTKKQLCPYAA 

VGE CRYGBNCVYLHGDS CDMCGLQ VLHPMDAAQRSQH I KSCIEA 

HEKDMELSFAVQRSKDMVCGICMEVVYEKANPSERRFGILSNCN 

HTYCLKCI RKWRSAKQ FESK I IXSCPECRI TSNFVI PS E YWVBE 

KEEKQKL I LKYKEAMSNfCACR YFDEGRGS CP FGGNC FYKHAYPD 

GRREEPQRQKVGTS SR YRAQRRNH FWEL IEERENSNP FDNDEEE 

W?FSr/3EMLLMLLAAGGDDELTDSEDEWDLFHDEI,EDFYDLDL


6208


2924


1471 


TVMAEAATPGTTATTSGAGAAAATAAAAS PTPIPTVTAPS LGAG 
GGGGGSDGSGGGMTKQVTCRYFMHGVCKEGDNCRYSHDLSDSPY 
S WCKYFQRGYCI YGDRCR YEHS K PIjKQBEATATELTTKS 5LAA 
SSSLSSrVGPLVEMNTGEAESRNSNFATVGAGSEDW VNAI EFVP 
GQPYCGRTAPSCTEAPLO^SWKEESEKEQTAVETKKQLCPYAA 
VGECRraENCVYLHGDSCDMCGLQVLHPMDAAQRSQHIKS CIEA 
HEKDMELSFAVQRSKDMVCGI CMEWYEKANPS ERRFG I LSNCN 
HTYCLKCIRKWRSAKQFESKIIKSCPECRITSNFVIPSEYWVEE 
KEEKQKLII#KYKEAMSNKACRYFDEGRGSCPFGGNCFYKHAYPD 
GRREEPQRQKVGTSSRYRAQRRNHFWELIEEREWSNPFDNDEEB 
VVTFELGEMIJ^MLLAAGGDDELTDSEDEWDLFHDELEDFYDLDL 


6209 


1758 


829 


ER LCF PCMQS KI YS YMSPNKCSGMRFP LQE ENS VTHHB VKCQGK 
PLAGI YRKREEKRNAGNAVRS AMKSEEQKI KDAR KGPLVP FPNQ 
KSBAAEPPKTPPSSCDSTNAAIAKQALKKPIKGKQAPRKKAQGK 
TQQNRKLTDFYPVRRSSRKSKAELQSEERKRIDELIESGKEEGM 
KI DL I DGKGRGVIATKQFSRGDFWE YHGDL I E I TDAKKRE ALY 
AQDPSTGCYMYYFQYI^KTYCVDATRETNRLGRLINHSKCGNCQ 
TKIiHDIDGVPHLILIASRDIAAGEELLYDYGDRSKASIEAHPWL 
KH 


6210 


3761 


387 


I FGMS KLRMVLLEDSGSADFRRHFVNL S PFT I TWLLLS ACPVT 
SSLGGTDKELRLVDGENKCSGRVEVKVQEEWGTVCNNGWSMEAV 
SVICNQLGCPTAIKAPGWANSSAGSGRIWMDHVSCRGNESALWD 
CKHDGWGKHSNCTHQQDAGVTCSDGSNLEMRLTRGGNMCSGRIE 
I KFQGRWGTVCDDNETO IDHAS VICRQLE CGSAVS FSGSS N FGEG 
SGPIWFDDLICNGNESALWNCKHQGWGKHNCDHAEDAGVICSKG 
ADLSLRLVDGVTECSGRLEVRFQGEWGriCDDGWDSYDAAVACK 
QLGCPTAVTAI GRVNASKG FGHIWLDS VS CQGHE PAVWQCKHHE 
WGKHY CNHNEDAGVTCSDGSDLELRLRGGGS RCAGTVEVEI QRL 
LGKVCDRG WGLKEADWCRQLGCGSAIiKTS YQVYS K IQATNTWL 
FLSSCNGNBTSLWDCKNWQWGGLTCDHYEEAKITCSAHREPRLV 
GGDI P CSGRVEVKHGDT WGS 1 CDS DFS LE AAS VLCR ELQCGTW 
S I LGGAH FGEGNGQ I WAEE FQ CEGHESHLSLCPVAPRPEGTCS H 
SRDVGWCSRYTEIRLVNGKTPCEGRVELKTLGAWGSLCNSHWD 
IEDAHVLCMLKCGVAIiSTPGG^FGKGNGQIWRHMFHCTGTEO 
KMGDCPVTALGASLCPSEQVASVICSGNQSQTLSSCNSSSLGPT 
RPTI PBESAVACI ESGQLRLVNGGGRCAGRVBI YHEGSWGTI CD 
DSMDLSDAHVVCRQLGCGEAINATGSAHFGEGTGPIWLDEMKCN 
GKBSRIWQCHSHGWGQQNCRHICEDAGVICSEFMSLRLTSEASRE 
ACAGRLEVFYNGAWGTVGKSSMSETTVGVVCRQLGCADKGKINP 
ASLDKAMSI PMWVDNVQCPKGPDTIiWQCPSSPWEKRLASPSEET 
WITCDNKIRLQEGPTSCSGRVEIWHGGSWGTVCDDSWDLDDAQV 
VCQQLG CG PALKAFKEAEFGQGTG PI WLNE VKCKGNES S LWDCP 
ARRWGHSE CGHKEDAAVNCTD I S VQKTPQKATTGRS 5 RQS 5 F I A 
VGILGVVLLAIFVALPFLTKKRRQRQRIAVSSRGENLVHQIQYR 
EMN S CLNADDLDLMNS SGGHS B PH 


6211 



3761 


387


I FGMS KLRMVLLEDSGSADFRRHFVNLS PfT 1 TWLLLSACF VT 
SSI^GTDKELRLVDGBJTKCSGRVBVKVQEEWGTVCNNGWSMEAV 
SVICNQLGCPTAIKAPGV3ANSSAGSGRIWMDIIV9CROKESALWD 
CKHDGWGKHSNCTHQQDAGVTCSDGSNLEMRLTRGGNMCSGRIE 
I KFQGRWGTVCDDNFNI DHASVI CRQLECGSAVS FSGSS N FGEG 
SGPIWFDDLI CNGNESALWNCKHQGWGKHNCDHAEDAGVI CSKG 
ADLS LRLVDG VTECS GRLEVRFQGEWGT I CDDGWDS YDAAVACK 
QLGCPTAVTAI GRVNAS KGFGH I WLDSVSCQGHE P AVWQ C KHHE 
WGKHYCNHNEDAGVTCSDGSDLELRLRGGGSRCAGTVEVEIQRL 
LGKVCDRGWGLKEAEWCRQLGCGSALKTS YQVYS KIQATNTWL 
FLS S CNGNETSLWDCKNWQWGGLTCDHYEEAK I TCS AHRE PRLV 
GGDI PCSGRVEVKHGDTWGS ICDSDFSLEAAS VLCRELQCGTW 
S ILGGAHFGEGNGQI WABEFQCEGHESHLSLCPVAPRPEGTCSH 
S RDVGWCS R YTE I RLVNGKTP CEG RVE LKTLGAWGS LCNSHWD 
.IEDAHVLCQQLKCGVAIiSTPGGARFGKGKGOIWRHMFHCTGTEQ 
HMGDCPVTALGASLCPSEQVASVI CSGNQSQTLSS CNSSSLGPT 
RPTI PBESAVACI ESGQLRLVNGGGR CAGRVB I YHEGSWGTICD 
DSWDLSDAHWCRQLGCGEAINATGSAHFGEGTGPIWLDEMKCN 
GKESRI WQCHSHGWGQQNCRHKEDAG VI CS EFMSLRLTS EASRB 
ACAGRLE VFYNG ANGT VGKS S MS ETTVG WCR QLGCAD KGX INP 
ASLDKAMSIPMVTVTDNVQCPKGPDTLWQCPSSPWEKRLASPSEET 
WITCDNKIRLQEGPTS CSGRVE IWHGG5 WGTVCDDS WDLDDAQV 
VCQQLGCX5PALKAFKEABFGQGMPIWLNBVKCKGNESSLWDCP 
ARRWGHSECGHKEDAAVNCTDISVQKTPQKATTGRSSROSS FIA 
VG I LG WLLAI FVALFFLTXKRRQRQRLAVSSRGENLVHQ I Q YR 
EMNS CLNADDLDLMNS SGGHS E PH


6212


1 


1134 


LKWELRPGGAVMGTGRGAGTQAPR3CCCQTNPGPPSSLRRAFRR
RELPFPACIHEIGLG^VEAGSGPPPAPAARESRSRAMEEEASSPQL 
GCSKPHLBKLTLGITRlIiBSSPGVTEVTIIEKPPAERHMISSWE 
Q KNNC VM PEDVKNF YLMTNG FHMTWS VK LDEH 1 1 PLGSMAI NS I 
S KLTQLT QS SMYS L PNAPTLADLEDDTHEAS DDQPE KPH FDS RS 
VI FELDS CNGSGKVCLVYKSGKPALAEDTE I WFLDRALYWHFLT 
DTFTA YYRLLITHLGLPQWQYAFTS YG I S PQAKQRVSMYKP ITY 

NTNLLTEETDSFVNKLDPSKVFKSKNKIVIPKKKGPVQPAGGQK 
GPSGPSGPSTSSTSKSSSGSGNPTRK 


6213 


1 


1134 


LKWELRPGGAVWGTGRGAGTGAPRSCCCQTNPGPPSSLRRAFRR 
RELPFPACHEIGLGAEAGSGPP PAPAARESRSRAMEEEASS PGL 
GCSKPHLEKLTLGITRILESSPGVTEVTIIEKPPAERHMISSWE 
QK1WCVMPEDVKNF YLMTNGFHMTWSVKLDEHI IPLGSMAINS I 
SKLTQLTQSSMYSL PNAPTLADLEDDTHEAS DDQPEKPHFDSRS 
VI FELDS CNGSG KVCLVYKSGKPALAEDTE I WFLDRAL YWHFLT 
DTFTAYYRLLITHLGLPQWQYAFTSYGISPQAKQRVSMYKPITY 
NTNLLTEBTDS FVNKLDPSKVFKSKNKIV1 PKKKGPVQPAGGQK 
GPSGPSGPSTSSTSKSSSGSGNPTRK 


6214 


2 


460 


HBIAPSAIRRAARI^I^PARWQSRAAAPVfVRGFRTGWSFVG^ 
VLGTSAKRTRLFFFLSKMAASSRAQVLALYRAMLRESKRFSAYN 
YRTYAVRRIRDAFRENKNVKDPVE I QTLVNKAKRDLGVTRRQVH 
IGQLYSTDKLI IENRDMPRT 


6215 


2 


1849 


FVAGGPRGSGSAAETMPE IRVTPLGAGQDVGRSCI LVS 1AGKNV 
MLDCGUHMGFXIDDRRFPDFSYI l^JNGRtiTDFLDCVI ISHFHLDH 
CGALPYFSEMVGYDGPI YMTHPTQAI CPILLEDYRKIAVDKKGE 
ANFFTSQM I KDCMKKWAVHLHQTVQVDDELE I KAYYAGHVLGA 
AMFQIKVGSESVVYTGDYNMTPDRHLGAAWIDKCRPNLLITEST 
YATTI RDSKRCRERDFLKKVHETVERGGKVL I PVFALGRAQELC 
IIJLETFWERMNLKVPIYFSTGLTEKA^fHYYKLFIPWTNQKIR^CT 
FVQRNMFEFKHIKAFDRAFADNPGPMWFATPGMIiHAGQSLQIF 
RKWAGNEKNMVIMPGYCVQGTVGHKILSGQRBa^EMEGRQVljEVK 
MQ VE YMS FSAHADAKG I MQ LVGQAE P ES VLL VHG EAKKME FL KQ 

KIEQELRVNCYMPANGETVTLPTSPSIPVGISLGLLKREMAQGL 
LPEAKKPRLLHGTL1MKDSNFRLVSSEQALKELGLAEHQLRFTC 
RVHLHDTRKBQETALRVYSHLKgVLKDHCT/QHLPDGSVTVE SVL 
LQAAAPSEDPGTKVLLVSWTYQDEELGSFIjTSLLKKGLPQAPS 


6216 


11 


393 


QTTRPE PRNSAIjRQSRSKMAWGVS S VS RLLGRS RPQLGRPMS S 
GAHGEEGSARMWKTLTFFVAIiPGVAVSMLNVYLKSHHGEHERPE 
FIAYPHLRIRTKPFPWGDGNHTLFHNPHVNPLPTGYEDE 


6217 


9 


1178


TRVGRGESGIiKMEVKPPPGRPQPDSGRRRRRRGbEGHDPKEP"EQ~ 
LRKLF IGGLS FETTDDSLREHFEKWGTLTDCVVMRDPQTKRSRG 
FGFVTYS CVEEVDAAMCARPHXVDGRWE PKRAVSREDS VKPGA 
HLTVKKIFVGGI KEDTEEYNLRDYFEKYGKIETIEVMEDROSGK 
KRG FAFVTFDDHDTVDK1 WQKYHTINGHNCEVKXALSKQEMQS 
AGSQRGRGGGSGNFMGRGGNFGGGGGNFGRGGNFGGRGGYGGGG 
GGSRGSYGGGIX3GYNGPGGDGGKYGGGPGYSSRGGYGGGGPGYG 
KQGGGYGGGGGYDGYNEGGNFGGQNYGGGGNYNDFGNYSGQQQS 
NYGPMXGGSFGGRS&GSPYGGGYGSGGGSGGYGSRRF 


6218


13 OS 


906 


S CERRGFIMADDLKRFLYKXLPS VEGLHAZ WSDRDGVPVI KVA 
NDNAPEHALRPGFLSTFALATDQGSKLGLSKNKS 1 1 CYYNT YQV 
VQFNRLPLWS F IAS S SANTGLI VSLEKELAPLFEELR QWEVS 


6219 


2 


890


AGPGEGAGAGTRCAGAEAEMASAGGEDCESPAPBADRPHQRPFL 
IGVSGGTASGKSTVCEK1MELLGQNEVEQRQRKWILSQDRFYK 
VLTABQKAKALKGQYNFDHPDAFDNDLMHRTLKNIVEGKTVEVP 
TYDFVTHSRLPETTVVYPADVVliFEGILVFYSQEIRDMFHLRLF 
VDTDSDVRLSRRVLRDVRRGRDLBQIiTQYTTFVKPAFEEFCLP 
TKKYADVII PRG VDNM VAINL I VQH I QD I LNGDI CKWHRGGSNG 
RSYKRTFSEPGDHPGMLTSGKR5HLESSSRPH 


6220 


227 


764 


EQNIS LEWS CTI EKALADAKAL VERLRDHDDAAESLI EQTTALN 
KRVEAMKQ YQEEIQK LNEVARH RPRSTLVMG I QQENRQ IRELQQ 
EN KELRTS LEEHQS ALE LI MS KYREQMFRLLMAS KKDDPGI I M K 

LKEQHSKIDMVHRMKSEGFFLDASRHILEAPQHGLERRHLEANQ 
NVH 


6221 


98 


916 


RWIWDI^PVSDGLEIjRPKYNGILHCLTTIWKLIXjIjRGLYQGVTP" 
NIWGAGLSWGLYFVrraAIKSYKTEGRAERLEATEYLVSAAEAG 
AMTLCITNPLWVTKTRLMLQYDAVVNSPHRQYKGMFDTLVKIYTC 
YEGVRGLYKGFVPGLFGTSHGALQFMAYELLKLFCYKQHINRIjPE 
AQLSTVE YISVAALSKI FAVAATYP YQWRARLQDQHMFYSGVI 
DVITKTWRKEGVGGFYKGIAPNLIRVTPACCITFWYENVSHFL 
LDLREKRK 
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WO 01/53312 


6222 


2 


2116 


MARELRALLLWGRRLRPIJ 1 RAPALAAVPGGKPILCPR^ 
PRRNPAWSLQAGRLFSTQTAEDKEEPIiHSIISSTESVQGSTSKH 
EFQAETKKLLDIVARSIiYSEKEVFIJRELISNASDALEKLRJIKLV 
S DGQAL P E M E I HLQTN AE KGTI T I QDTG IGMTQEEL VS NLGTIA 
RS GS KAFLDALQNQ AEAS S KI I GQFGVGPYS AFMVADRVE VYSR 
SAAPGSLGYQWLSDGSGVPB1AEASGVRTGTKI I1HLKSDCKEF 
SSEARVRDWTKYSNFVSFPLYLNGRRMNTLQAIWMMDPKDVRE 
WQHEB FYR YVAQAHDKPR YTLHYKTDAPLNIRS I F YVPDMKPSM 
FDVSRELGSSVALYSRKVLIQTKATDILPKWLRFIRGWDSEUI 
PLNLSRELLOESALIRKLRDVLQQRLIKFFIDQSKKDAEKYAKF 
FEDYGLFMR3GIVTATEQEVKEDIAKLLRYESSALPSGQLTSLS 
EYASRMRAGTRiaYYLCAPNRHIJ^SPY^EAMKKKDTEVLFCF 
EQFDELTLLHLREFDKKKLI SVETDIWDHYKEEKFEDRSPAAE 
CLSEKETEEI^WNRNVIX3SRVTNVKVTLRLDTHPAMVTVI,EWG 
AARHFLRMQQLAKTQEERAQLLQPTLEINPRHAIjIKKLNQLRAS 
E PGLAQLLVDQIYEKAMIAAGLVDDPRAMVGRLNEIiLVKALERH 


6223 
6224 


3 
1 


715 
133 


dawartmagmvdfqdeeqvksflenmevecnyhc^kekdpdgcy 
rlvdylegirknfdeaakvlkfnceenqhsdscyklgayyvtgk 

GGLTQDL KAAARCFLMACEKPGKKS I AAOHNVGLLAHDGQVNBD 
GQ PDLGKARDYYTRACDGGYTSSCFNLSAflFLQGAPGFPKDMDL 
ACKYSMKAGDLG«IWACANASROTKLGDGVDKVE^ 
QQVHKEQQKGVQPLTFG 


6225 


3259 



938 


LRTISSMAWGPLLLTLliAHCTGSWAQSVLTQPPSVSGARrPHBK 

LLSCHRIoAICKJoPFSVESRKTVMGPQGARRQAFLAFGDVTVDPT 

QKEWRLLSPAQRALYREVTLENYSHLVSLGILHSKPBIiIRRLEQ 

GEVPWGEERRRRPGPCAGIYAEHVLRPKNLGIiAHQRQQQLQFSD 

QSFQSDTAEGQEKEKSTKPMAFSSPPLRHAVSSRRRNSWEIES 

SO^RENPTEIDKVLKGIENSRWGAFKCABRGQDFSRKMMVIIH 

KKAHSRQKLFTCRECHQGFRDE3ALLLHQNTHTGEKSYVCSVCG 

RGFS I»KANLLRHQRTH3GE KPFLCKVOGRGYTSKSYLTVH krtw 

TGEKPYECQECGRRFNDKSSYNKHLKAHSGEKPFVCKECGRGYT 

NKSYFWHKRIHSGEKPYRCQECGRGFSNKSHLITHQRTHSGEK 

PFACRQCKQS FS VKGSLLRHQRTHSGEKPFVCKDCERSFSQKST 

LVYHQRTHSQEKPFVCPJBCGQGFIQKSTLVKHQITHSBEKPFVC 

KD CG RG F I Q KS TFTLHQRTHS EE KP YGCR ECGRRFRDKS S YNKH 

LRAH LG EKRFFCRDCGRG FTLKPNLT I HQRTHS GE KP FM CKQ CE 

KS FS LKANL LRHQ WTHS GERP FNCKD CGRGF I L KS TLL FHQ KTH 

SGBKPFICSECGQGFIWKSNLVKHQLAHSGKQPFVCKECGRGFN 

WKGNLLTHQRTHSGE KP FVCNVCGQG FS WKRS LTRHHWR IHS KE 

KPFVCQECKRGYTSKSDLTVHERIHTGERPYBCQECGRKFSNKS 

YYSKHIjKRHLREKRFCTGSVGEASS 


6226 


29 


266 


TKVSELLGGSQRJLFFLPLWRRLCRCGLGPRVSPMAdPRVEVDGS 
IMEGGGOSLRVSTGLSWLLSLPWRAQRIRAGRSYA 


6227 


2581 


890 


MS AS S LLEQRP KGQGNKVQNGS VHQKDGLNDDD FEP YIiS PQAR P ~ 
NNAYTAMSDSYLPSYYSPSIGFSYSLGEAAWSTGGDTAMPYLTS 
YGQLSNGEPEFLPDAMFGQPGALGSTPFU3QHGFN?KPSGIDFS 
AWGNWSSQGQSTQSSGYSSNYAYAPSSLGGAMIDGQSAFANETL 

nkapgmntidqgmaalklgstevasnvpkvvgsavgsgsitsni 
vasnslppatiappkpaswadiaskpakqqpklktkngiagssl 
p pp pi khnmd igtwdnkgp vakapsqal7qnigq ptqgspqpvg 
qqannsppvaqasvgqqtqplpppppqpaqlsvqqqaaqptrwv 

APRNRGSGFGHNGVDGNGVGQSQAGSGSTPSEPHPVLEKLRS in 

nynpkdfdwnlkhgrvfi iksyskddihrsikyni WCSTBHGNK 

RLDAAYRSMNGKGPVYLLFSVNGSGHFCGVAEWKSAVDYNTCAG 
VWSQDKWKGRFDVRWI FVKDVPNSQLRHIRLENNENKPVTNSRD 
TQEVPLEKAKQVLKI IASYKHTTS I PDDFSHYEKRQ 


6228 


47 


1978


GRRCRRRQAVMEI^QE^EX.GCWAVEKMGVPVAARAPESTLRRir" 
CLGQGADIWAY1LQHVHSQRTVKKIRGNLLWYGHQDSPQVRRKL 
ELEAAVTRLRAEIQELDQSLSLMERDTEAQDTAMEQARQHTQDT 
QRRALLLRAQAGAMRRQQHTLRDPMQRLQNOLRRLODMER KAKV 
DVT FGS L T S AALGL E P WLR D VRTACTL RAQ FLQN LLL P Q AKRG 
SLPTPHDDHFXSTSYQQWLSSVETLLT^PPGHVLAALEHJUAAER 
EAE IRS LCSGDGI/3DTEI S RPQA PDQSDS SQTLPSM VHL I QEG W 
RTVGVLVSQRSTLLKERQVLTORLQGLVEEVERRVLGSSERQVL 
I LGLRRCCLWTELKALHDQSQELQDAAGHRQLLLRELQAKQQR I 
LH WRQL VE BTQEQVRLLI KGNSAS KTRLCRS PGEVLALVQRKW 
PT FE AVAPQS RELLR CLEBEVRHLPH ILLGTLLRHR PGELKPLP 
TVLPS IHQLHPASPRGSSFIALSHKIX3LPPGKASELLLPAAA3 L 
RQDLLL LQDQRSLWCWDLLHMKTSLP PGLPTQELLQ I QAS QBKQ 
QKENLGQALKRLEKLLKQALER I PEIiQG I VGDWWEQPGQAALS B 
ELCQGLSLPQWRLRWQAQGALQKLCS 


6229 


1571


560 


GPSIiLGTRGTPNPARTLQIFFLIIGRRLTGRMAAVDDLQFEEF^" 
NAATSLTAN PDATTVN I EDPGETPKHQPGS P RGSGRE EDD ELLG 
NDDSDKTELLAGQKKSSPFWTPBYYQTFFDVDTYQVFDRIKGSL 
LPIPGKNFVRLYIRSMPDLYGPFWlCATLVFAIAISGNLSNFIiI 
HLGEKTYHYVPEFRKVS IAATI I Y AY AWL VP 1ALWG FLMWRNS K 
VMNI VS YS FliEI VCVYG YSLPrYI PTAILWI l p mravu w t t vm t 
ALG I SGSLLAMTFWPAVREDNRRVALATIVTI VliLHMLLSVGCL 
AYFFDAPEMDHLPTTTATPNQTVAAAKSS 


6230 


1723 


600 



SKMSGRSGKKKMSKXiSRSARAQVlFPVGRLMRYLKKGTFKyRiS 
VGAP VYMAAVIEYLAAE I LELAGNAARDNKKAR IAPRHI LLAVA 
NDEKL^QLLKGVTIASGGVLPRIHPELLAKKRGTKGKSETILSP 
PPEKRGRKATSGKKGGKKSKAAKPRTSKKSKPKnqn^PfiTQWQ'r 

SBDGPGDGFTILSSKSLVLGQKLSLTQSDISHIGSMRVEGIVHP 
TTAE IDLKBDI GKALEKAGGKBFLETVKBLRKS QGPLEVAEAAV 
SQSSGLAAKFVIHCHIPQWGSDKCBEQLEETIKNCLSAAEDKKL 
KSVAFPPFPSGRNCFPKQTAAQVTLKAISAHFDDSSASSLKNVY 
FLLFDSES IGI YVQEMAKLDAX 


6231 


149 


870


LiFSSSTMDRSIJINVLWSFtiFLLLFTAYGGLQSLQSSLYSEEG 
LGVTALSTIiYGGMLLS SM FLPP LL I ERLGCKGTI ILSMCGYVAF 
SVGNFFASWYTLIPTSILLGLGAAPLWSAQCTYLTITGNTHABK 
AGKRGKDMVNQYFGIFFLIFQSSGVWGNLISSIiVFGQTPSQETL 
PEBQLTS CGASDCLMATTTTNSTQR PS QQLVYTIiLGI YTGS GVL 
AVLMIAAFLQPIRDVQRESE 


6232 
6233 


3679 
1 


1476 
2654 


^agttmagfwVgtaplvaagrrgrwppqqlmlsaalrtlkhvl 

YYSRQCLMVSRNLGSVGYDPNEKTFDKILVANRGEIACRVIRTC 
KKMGI KTVAIHSDVDASSVHVKMADEAVCVGPAPTSKSYLNMDA 
IMEAIKJCTRAQAVHPGYGFLSENKEFARC1AAEDWFIGPDTHA 
IQAMGDKI E5KLLAKKABVOTIPGFDGVVKDAEEAVR IARE IGY 
P VM I KAS AGGGGKGMR I A WDD E ETRDG FRLS S QEAAS S FGD D R L 
LTEKFI DNPRHIE I Q VLGDKHGN ALWLNERECS I QRRNQKWEE 
APSIFLDAETRRAMGKQAVAIjARAVKYSSAGTVEFLVDSKKNFY 
FLEMNTRLQVBHPVTECITGIiDLVQEMIRVAKGYPLRHKQADIR 
INGWAVECRVYAEDPYKSFGLPSIGRLSQYQEPLHLPGVRVDSG 
IQPGSDIS I YYDPMISKLITYGSDRTBALKRMADALDNYVIRGV 
THNIALLREVI INSRFVKGDI STKFLS DVYPDGFKGHMLTKS EK 
NQLLAIAS SLFVAFQIJRAQHFQENSRMPVTKPDIANWEIjSVICLH 
DKVHTWASNNGSV7SVEVDGSKIJ*VTST^^ 
QRWQCLSREAGG17MSIQFIX3TVYKVNILTRLAAELNKFMLEKV 
TEDT S S VLRS PMPGVWAVS VKPGDAVAEGQE I CVI EAMKMQNS 
MTAG KTGTVKS VHCQAGDTVG EGDLLVELE 
HSTRENLNAGNFNFPSEGHLVRSTGPGGSFAKHMVAQCVSPKGP 
LACSRTYFFGATHVPiLGGDSKIiPKKTEQIRLLSQIYAAVIEAV 
LAGIACYAKTSSLTKAKEVAEQTLGSGLDSFELIPFKAALRSKM 
T FH IHAVNNQGR IVFLDS ED S hS FVKTACMAVYD I P D LLGGNGC 
LGSWFSES FLTSQ I LVKEKDGTVTTETS S WLTAAVPRFCS WL 
VEDNE V KL S E KTHQAVRGDE S FLGTYLTG GEGAYL YS S NLQS W P 
EEGNVHFFS SGLLFSHCRHGS 1 I ISKDHMNS ISFYDGDSTSTVA 
ALLID F KS S UiPHLFVHFHGSSNFIiM IALF PKS KI YQAFYS BV F 
SLWKQQDNSGISLKVIQBDGLSVEQKRLHSSAOKLFSALSQPAG 
EKRSSLKLLSAKLPELDWFLQHFAIS S I SQBPVMRTHLPVLLQQ 
AEINTTHRIBSDKVIISIVTGLPGCHASELCAFLVTLHKECGRW 
MVYRQIMDSSECFHAAHFQRYLSSALEAQQNRSARQSAYIRKKT 
RL L WL QG YTD V I D WQ ALQT H P DSNV KAS FTI G AI TACVE P M S 
CYMEHRFLFPKCIJ)QCSQGLVS>rVVFTSHTTEQRHPLLVQIjQSL 
IRAANPAAAFlLAENGXVTRNEDIBLILSENSFSSPEMbRSRYIj 
M YPGW YEGKLNAGS VY PIMVQ I CVWFGRPLEKTRFVAKCKAI QS 
SIKPSPFSGNiraiLGKVKFSDSERTMEVCYNTLANSLSIMPVIi 
EGPTPPPDSKSVSQDSSGQQECYLVFIGCSUCEDS IKDWLRQSA 
KQKPQRKALKTRGMLTQQEI RS I HVKR H LE PLPAGY FYNGTQFV 
NFFGDKTDFHPLMDQFMNDYVEEANREIBKYNQELEQQEYHDLF 
ELKP 


6234 


1731 


404 



PRVREDMOHKSPGNKGSLVYAGI KS IVKS9L#GMV£SS RHNWSQL 
D KQ 3 D I QNLNE ER I LALQLCGW I KKGTDVDVGPFLNSLVQEGB W 
ERAAAVALFNLDIRRAIQILNEGASSEKGDLNLNWAMAIiSGYT 
DEKNS^WRBT4CSTLRIjQIiNNPYLCVMFAFLTSETGSYDGVL»YBN 
KVAVRDRVAFACKFLSDTQLNRYIEKLTNEMKEAGNLEGI L»LTG 
LT KDGVDLME S YV DRTG D VQTAS Y CMT «QG S PLDVLKD ERVQYW I 
ENYRNLLDAWRFWHKRAEFD IKRSKLDPS SKPLAQVFVS CNFCG 
KS IS YSCSAVPHQGRGFSQYGVSGSPTKS KVTSCPGCRXPLPRC 
ALCL INMGTPVSSCPGGTKS DEKVDLS KDKKLAQFNNWFTWCHN 
CRHGGHAGHMLSWFRDHAECPVSACTCKCMQLDTTGNLVPAETV 
QP 


6235 


1 


57L 


EKRDHRLPSWPRAAIiKVPGRGGRVGTTPELAAGGIMATRNPPPQ 
D YESDDDS YE VLDLTE YARRHQWWNRV FGHS SGPMVBKYSVATQ 
IVMGGVTGWCAGFLFQKVGKIiAATAVGGGFLLLiQIASHSGYVGI 
DWKRVEKDVNKAKRQIKKRANKAAPEINNLIEEATEFIKQNIVI 
SSGFVGGFLLGLAS 


6236 


1 


703 


WDQNKGAAAGSGLTIiPSLPSARFSAGPPTQRSRPTMSNMEKHLF 
NLKFAAKELSRSAXKCDKEEKAEKAKIKKAIQKGNMEVARrHAE 
NAIRQKNQAVNFLRMSARVDAVAARVQTAVTtWKVTKSMAQVVK 
SMDATLKT^LEKISALMDKFEHQFETLDVQTQQMEDTMSSTTT 
LTTPQNQVDMLLQEMADEAGLDLNMELPGGQTGSVGTSVASAEQ 
DELSQRLARLRDQV 


6237 


312 


720 


PTAMAEEGIAAGGVMDVHTALQEVIjKTAIiIHDGLARGIREAAKA 

LDKROJUiLCVLASNCDEPMYVlCLVEAIiCABHQINLI 

GSWVGLCKIDREGKPRKVVGCSCVV^ 

CKK 


6238 


? 


4666 


EE VPTQES VKWEXNVI I KNP EI VF VADMTKNDAPALVI TTQCEI 
CYKGNLENS TMTAAI KDLQ VRACP FltP VKRKGKI TTVLQPCDLF 
YQTTQKGTD PQVT DMS VKS LTLKVS ?VI INTMITITSALYTTKE 
TI PEETASSTAHLWEKKDTKTLKMW51jEESNETEKIAPTTELVP 
KGEMIKMNIDSIFIVLBAGIGHRTVPMLLAKSRFSGEGKNWSSL 
INLHCQI^LEVHYYNEMFGVWEPLLEPLEIDQTEDFRPWNLGIK 
MKKXAKMAIVESDPEEENYKVPEYKTVISFHSKDQLNITLSKCG 
LWLNNLVKAFTSAATGSSADFVKDLAPFKILNSLGLTISVSPS 
DS FSVLNI PMAKS YVLKNGESLSMDYI RTKDNDH FNAMTS LSS K 
LFFILLTPVNHSTADKIPIiTKVGRRLYTVRHRESGVERSIVCQI 
DTVEGSKKVTIRSPVQIRNHFSVPLSVYBGDTIiLGTASPBNEFN 
I PLGSYRSFI FLKPEPENYQMCEGIDFEEI I KNDGALLKICKCRS 
KNPSKESFLINIVPEKDNLTSLSVYSEDGWDLPYIMHLWPPILL 
RNLLPYXI AYYIEG IENS VFTLSEGHSAQI CTAQLG KARLHLKL 
LDYLNHDWKSEYHIKPNQQDISFVSFTCVTEMEKTDLDIAVHMT 
YNTGQTWAFHS PYWMVNKTGRMLQYKADG IHRKHP PM YKKPVL 
FS FQPNHFFNNNKVQLMVTDSELSNQFS 1 DTVGSHGAVKCKGLK 
MDYQVGVTIDLSSFNITRIVTFTPFYMI KNKSKYHI SVAEEGND 
KWLSLDLBQCIPFWPEYASSKLLIQVERSEDPPKRIYKNKQENC 
ILLRLDNELGGI IAEVNLAEHSTVITFLDYHDGAATFLLINHTK 
NELVQYNQSSLSEIEDSLPPGKAVFYTWADPVGSRRLKWRCRKS 
HGEVTQKDDMMMPIDLGEKTIYLVSFFEGLQRIILFTEDPRVFK 
VTYES EKAELAEQEIAVALQDVGI SLVNNYTKQE VAY IGI TSSD 
VWETKPKKKARWKPMSVKHTEKIiEREFKBYTESSPSEDKVIQL 
DTNVPVRLTPTGHNMKILQPHVIALRRNYLPALKVEYKTSAHQS 
SFRIQIYR1QIQNQ1HGAVFPFVFYPVKPPKSVTMDSAPKPFTD 
VS IVMRSAGHSQISR 1 KY F KVL IQEM DLRLDLGF I YALTDLMXE 
AE VTENTEVELFHKDIEAFKEE YKTASLVDQSQVSL YE Y FH I S P 
IKLHLSVSLSSGREEAKDSKQNGGLIPVHSLNI.LLKSIGATLTD 
VQDWF1CLAFFELNYQFHTTSDIK3S EVTKKYS KQAI KQMYVLIL 
GLDVLGNPFGLIREFSEGVEAFFYEPYQGAIQGPEEFVEGMAIiG 
LKALVGGAVGGLAOAASKITGAMAKGVAAMTMDEDYQQKRREAM 
NKQ P AGFREG I TRGG KGLVSGFVS G ITG IVTKP I KGAQ KGGAAG 
FFKGVGKGLVGAVARPTGG I IDMAS STFQG 1 KRATETS EVES LR 
PPRFFNEDGVIRPYRLRDGTGNQMLQKIQFYREWIMTHSSSSDD 
DDDDDDDDESDLNH 




6239


2108



634 


KPGMAGKGSSGRRPIiUW3LI*VAVATVHLVICPYTKVEES FNLQA j 
TKDLL YHW QDLE Q YDHLE FPGVVPRTFLGP WIAVFS S PAVYVL 1 
SLLEMSKFYSQLIVRGVl^I^IFGLWTI^KEVRRHFGAMVATM : 
F CWVTAMQ FHLM FYCTRTIj PNVLAIiP WLLAIiAAWLRHE WAR FI 
r WI*SAFAI I VFRVELCLFLGLL^LLALGNRKVSWRALRHAVPAG 
I LCLG LTVAVDS YFWRQLTW P EG KVL WYNTVLNKS SNWGTS P LL 
WYTYSALPRGLGCSLLFIPLGLVDRRTHAPTVI^ 
PHKELRFI IYAFPMLNITAARGCS YLLNNYKKSWLYKAGSLLVI 

DVAAAQTGVSRFLQVNSAWRYDKREDVQPGTGMiiAYTH I LMEAA 
PGIiliALYRDTHRVLAS WGTTGVS LNLTQLPPFNVHI,QTKIjVLL 
ERLPRPS 


6240 


2202 


1176 


HERGDS LKEPTS IAESSRHPS YRSEPSLEPESFRS PTFGKS FHF 
DPLSSGSRSSSIiKSAQGTGFEtiGQLQSIRSEGTTSTSYKSLANQ 
TRNGSLSYDSLLTPSDSPDF3SVQAGPEPDPPLGYTSPFLSARL 
AQQREAERHPRLVPTGPTHR2PSPVRYDNLSRHIVASLQEREXL 
LRQSPPLPGREEEPGLGDSGIQSTPGSGHAPRTSSSSDDSKRSP 
LGKTPLGRPAVPRFGKPDGLRGRGVGSPEPGPTAPYLGRSMSYS 
SQKAQ PG VSBTEE VAIjQ PLLTPKDE VQIiKTT YS KSKGQP KSU3S 
ASPGPGQPPLSSPTRGGVKKVSGVGGTTYEISV 


6241 


3 


1341 


RNAE E KKRLSLQRE KI IARVS I DNRTRALVQALRRTTD P K LC I T 
RVEBLTFHLLEFPEGKGVAVKERI1PYLLRLRQIKDETXQAAVR 
EILALIGYVDPVKGRGIRILSIDGGGTRGWALQTLRKLVELTQ 
KPVHQLTOYICGVSrGAILAFMLGLFHMPLDECEELYRKLGSDV 
FSQNVI VGTVKMS WSHAF YDSQTWENILKDRMGSALMIETARNP 
TCPKVAAVSTIVNRGITPKAFVFRNYGHFPGINSHYLGGCQYKM 
WQAIRASSAAPGYFAEYAJ^JTOLHQDGGLLEiNNPSAIiAMHECKC 
LWPDVPLECrVSLGTGRYESDVRNTVTYT 

TEBVHT MLDGLLP PDTYFRFNPVMCENI PLDE SRNE JCL»DQLQLE 
GWCYIERNEQKMKICVAKILSQEKTTLQKINDWIKLKTDMYEGLP 
PFSKL 


6242 



198 


1310 


QHFLPGAETWSPGAAVCTARRFPGRSLAAFPRPAAPRRAVEMGE 
SS ED I DQMFSTLW3BKDLLTQS LGVDTLP PPDPNP PRAEFNYSV 
GFKDLNESLNAIiEDQDLDAW^AnLVADISEAEQRTIQAQKESLQ 
NQHHS ASLQA5 1 FSGAAS LGYGTNVAATG ISQ YEDDL P P P P ADP 
VLDLPLP P PP PE PLSQEEE EAQAKADKI KLALEKLKEAKVKKL V 
VKVHMNDNSTKS LMVDERQLARDVLDNLFBKTHCDCNVDWCLYE 
IYPELQIERFFEDHENWEVLSDWTRDTENKILPLEKEEKYAVF 
KNPQNFYLDNRGKKE3KETNEKNNAKNKESLLBVRLILQSGRKB 
KDVCSIFKSFASENNGKI 


6243 


1509 


614


rsasrfsgcwsrdstcccCpstcwsrs^ascprarwppssapat
ts ras s rrlacg pqtragaetrstami raks aardtrratcrsa 
agtps pttmtcltdvptg caave ptarlpaaawasti ttg ccpa 

MGQAGAGPAGRKGSEAGGGPGRAHHAHPSPLPREPRVRTGPPAH 
SPTPGSIDPSPEIjSWGSAGVTQESPLLDPVDFLLFRTRAVDPLR 
RVFFFFYQHLTPPSIQPQPPPCHAFHPRDPPAGTKRQIiILVPLK 

gppilapilsltpilsrwscyfprsriaqgwhls 


6244 


2119 


1745 


fehayasqfgtflgnnesercklklqqktmslwswvnqpselsk
ftnplfbannlviwpsvapqslplwegiflrwnrsskyldeaye 
emvni i eynkelqakvnilrrqlaeletedgmqesp 


6245 


81 


1148 


lslrnakysfpqelislfsmtdlndnickryikmitnivilsli 
icislafwiismtastyygnlrpispwrwlfsvwpvlivsngl 
kkksldhsgalgglwgpiltianfsfftsllmfflssskltkw 
kge vkxrlds e ykeggqrnw vq v pcngavptelallytc i eng pg ' 
eipvdfs kqysaswmclsllaalacsagdtwasevgpvlskssp 
rlittwe kvpvgtnggvtwglvssllggtfvgiayfltqli pv 

OT3LDISAPQWPIIAFGGLAGLLGSIVDSYLOATMQYTGLDESTG 
MWNS PTNKARHIAGKP ILDNNAVNLFSS VLIALLLPTAAWGFW 
PRG 


6246


1177 


359 


S LW P W I LMDDS LMQI SLQLLCVYTANFPNGCS S LCWSS CGQH P V 
QATHRGAVSNSLMI^ILKLASQMPLENTTVQQMVFMLLSNLALS 
HDCKGVlQKSNFliQNFLSLALPKGGNKHLSNLTILWLKLLLNIS 
SGEDGQQMILRLDGCLDLLTEMSKYKHKSSPLLPLLIFHNVCFS 
PANKPKI LANBKVITVLAACLESENQNAQRIGAAALWALI YNYQ 
KAKTALKSPSVKRRVnEAYSIAKKTFPNSEANPLNAYYIiKCLElJr 
LVQLLNSS


6247


3 


1678 


NSRVWGPWTEPSAGSLRPMARKQNRNSKELGLVPIiTDDTSHAGP 
PGPGRALLECDHLR5 GVPGGRRRXDWS CSIjLVASLAGAFGS S FL 
YGYNLS WNAPTP YI KAFYNES WERRHGRPIDPDTLTLLWSVTV 
SIFAIGGLVGTLIVK>1IGKV1^RKHTLLANNGPAISAALLMACS 
LQAGAFEMLIVGRPIMGIDGGVALSVLPMYLSEISPKEIRGSLG 
QVTAIFICIGVPTGQLLGLPELLGKESTWPYLFGVIWPAWQL 
LSLPFLPDSPRYLLLEKHNEARAVKAFQTFLGKAHVSQEVEEVL 
AESRVQRS IRLVSVLELLRAPYVRWQWTVI VTMACYQLCGLNA 
I WFYTNS I FGKAG I P PAKI P YVTLSTGG I ETLAAVF S GLVI EHL 
GRRPLLIGGFGLMGLFFGTLTITLTLQDHAP5TVPYLS I VGILAI 
IASFCSGPGGIPFILTGEFFQQSQRPAAFIIAGTVNWLSNPAVG 
LLFPFIQKSLDTYCFLVPATICITGAIYLYFVLPETKNRTXAEI 
SGAFSKRNKAYPPEEKIDSAVTDGKINGRP 


6248


56 


1773 


VPPPRNMAAVPPGLEPWNRVRIPKAGNRSAVTVQNPGAAIjDLCI 
AAVIKECHLVILSLKSQTLDAETDVLCAVLYSNHNRMGRHKPHL 
ALKQVEQCLKRLKNMNLEGS iqdlfelfs snenq plttkvcwp 
S Q P WEL VLMKVLG ACKLLL RLLD CC CKTFLLTV KHliG LQE F 1 1 
LNLVMVGLVSRLWVLYKGVLKRLILLYEPIjFGLLQBVARIQPMP 
YFKDFTFPSDITBFLGQPYPEAPKKKMPIAFAAKGINKLLNKLF 
LINEQSPPJ^EETLLGISKKAKQMKINVQNNVDLGQPVKNKRVF 
KEESSBFDVRAFCNQIiKHKATQETSFDFKCSQSRLKTTKYSSQK 
VIGTPHAKSFVQRFREAESFTQLSEB1QMAWWCRSKKLKAQAI 
FLGNKLLKSNRLKHLEAQGTSLPKKLECIKTSICNHLLRGSGIK 
TSKHHLRQRRSQNKPLRRQRKPQRKLQSTLLREIQQFSQGTRKS 
ATDTSAKWRLSHCTVHRTDLYPNSKQLLNSGVSMPV1QTKEKMI 
HENLRGIHENETDSWTVMQINKNSTSGTIKETDDIDDIFAIiMGV 


6249 


56 


1773 


VPPPRMMAAVPPGLEPWNRVRI PKAGNRSAVTVQNPGAALDLCI
AAVI KECHLVILSLKSQTLDAETDVLCAVLYSNHNRMGRHKPHL 
ALKQVEQCLKRLKNMNLEGS IQDLFELFSSNENQPLTTKVCWP 
SQ P VVE LVLMKVLGACKLLLRLLDCCCKTFLLTVKHLGLQEFI I 
LWLVMVGLVSRLWVLYKGVLKRLILLYBPLFGLLQEVAR X QPMP 
YFKDFTPPSDITEFLGQPYFEAFKKKMPIAFAAKGINKLLNKLF 
LINEQS PRASE BTLLGI SKKAKQMKINVQNNVDLGQPVKNKRVF 
KEESSEFT)VRAFCNQLKHKATQETSFDFKCSQSRLKTTKYSSQK 
VIGTPHAKSFVQRFREAESFTQLSEEIQMAWWCRSKKLKAOAI 
FLGNKLLKSNRLKHLEAQGTSLPKKLECI KTSICNHLLRGSG1K 
TS KHHLRQRRSQNKFLRRQRKPQRKLQSTLLREIQQFSQGTRKS 
ATDTS AKWRLSHCTVHRTDLYPNS KQLLNSGVSMPVI QT KE KM I 
HENLRGIHBNETDS WTVMQINKNSTSGTIKETDDIDDI PALMGV 


6250 


232 


1306 


LAALHIMALPFRKDLEKYKDLDEDELLGNLSETELKQLETVLiDD 
LDPENADLPAGFRQKNQTSKSTTGPFDREHLLSYLEKEALEHKD 
RED YVP YTGEKKGKI F I PKQKPVQTFTEEKVSLDPELEE AL.TSA 
SDTELCDLAAIIXSMHNLIlTnTCFCNIPrcSSlJGVDQEHFSNWKG 
EKILPVFDEPPNPTNVEESLKRTKENDAHLVEVNLNNIKNIPIP 
TLKD FAKALETNTUVKCFS LAATRSNDPVATAFAEMLKVNKTLK 
SLNVESNF ITGVG ILAL IOALRDNETLAELKI DNQRQQLGTAVE 

LEMAKMLEENTNILKFGYQFTQQGPRTRAANAITKNNDLVRKRR 
VBGDHQ 


6251 


62 


972 


TPGSGPMSAWAAASLSRAAARCLIoARGPGVRAAPPRDPRPSHPE " 
PRGCGAAPGRTLHFTAAVPAGHNKWSKVRHI KGPKDVERSRI FS 
KLCLN I RIAVKEGGPWP EHNSNLANI LEVCRS KHMPKSTIETAL 
KMBKS KDTYLLYEGRGPGGSSIiLI EALSN5 SHKCQAD I RH I LNK 
NGG VMAVGARHSFDKKGVIVVEVEDREKKAVNLERALEMAIEAG 
AE DV KE TEDEE ERNVFKF I CDAS S LHQVRKKLDS LGLCS VS CAL 
EFIPNSKVQLAEPDLEQAAHIilQALSNHEDVIHVYDNIE 


6252


27 


1897 


EEFCT W IAVRVGEMETAP KPGKD VP PKKDKLQTKRKKPRRYWEE 
ETVPTTAGASPGPPRNKKNRELR PQRPKNAYILKKSRIS KKPQ V 
PKKPRBWKNPESQRGLSGAQDPFPGPAPVPVEWQKFCRIDKSR 
KLPHS KAKTRSRLEVAEAEEEETS IKAARSELLLAEEPGFLEGE 
DG ED TAKI CQAD I VEAVD I ASAAKHFDLNLRQFG P YRLNYS RTG 
RHI^GGRRGHVAALDWVTKKLMCBINVMEAVRDIRFLHSEALL 
AVAQNRWLH I YDNQG IELHC IRRCDRVTRL E F L P FHFLLATAS E 
TGFLTYLDVSVGKIVAALNARAGRLDVMSQNPYNAVIHLGHSNG 
TVSLWSPAMKEPliAKILCHRGGVRAVAVDSTGTYMATSGLDHQL 
KIFDLRGTYQPLS TRTLPHGAGHLAFSQRGLLVAGMGDWN1 WA 
bUGKAS PPSLEQP YLTHRLSGPVHGLQFCPFEDVLGVGHTGGI T 
SMLVPGAGEPNFDGLESNPYRSRKQRQEWEVKALLEKVPAELI C 
LDPRALAEVDVISLEQGKKEQIERLGYDPQAKAPFQPKPKQKGR 
SSTASLVTCRKRKV^EEHRDKVRQSLQQQHHKEAKAKPTGARPS 
ALDRFVR 


6253 


27 


1897 


EEFCTWIAVRVGEMETAPKPGKD VP PKKDKLQTKRKKPRRYWEE 
ETVPTTAGASPGPPRNKKNRELRPQRPKNAYILKKSRISKKPQV 
PKKPREWKWPESQRGLSGAQDPFPGPAPVPVEWQKFCRIDKSR 
KL PHSKAKTRSRLEVAEAEEEETS I KAARS ELLLAE EPGFLEGE 
DGEDTAKI CQAD I VEAVD IAS AAKHFDLNLRQFGP YRLNYS RTG 
Rffl^FGGRRGHVAALDWTKKLMCEINVMEAVRDIRFT^SEALL 
AVAQNRW LH I YDNQG I E LHC I RRCDRVTRLE FLP FH F L LATASE 
TGFLTYLDVS VG KI VAAIW ARAGRLDVMSQ1TP YNAV IHLGHSNG 
TV^LWSPAMKEPLAKILCHRGGVRAVAVDSTGTYMATSGLDHQL 
KI FDLRGTYQ PLSTRTLPHGAGHLAFSQRGLLVAGMGDVVNIWA 
GQGKASPPSLBQPYLTHRLSGPVHGLQFCPFEDVLGVGHTGGIT 
S M LVPGAG EPNFDGLESNPYRS RKQRQE W EV KALLEKVPAELI C 
IiDPRALAEVDVISLEQGKKEQIERLGYDPOAKAPFQPKPKQKGR 
S S TASLVKRKRKVMDBEHRDKVRQSLQQQHH KEAKAKPTGARPS 
ALDRFVR 


6254 


155 


1139 


HAIX3RRGGSQELSAAACGCFALRLRAPGSGRPALAPGAAAFAGL 
GGAPRFPPRGSAAGRTMLLXEYRI CMPLTVDE YKIGQLYM ISKH 
S HEQSDRGEGVE WQNE PFEDPHHGNGQFTEKRVYLNS KLi P SWA 
RAWPKIFYVTEKAWNYYPYTITEYTCSFLPKFSIHIETKYEDN 
KGSNDTIFDNEAKDVEREVCFIDIACDEIPERYYKESEDPKHFK 
S BKTGRGQLREGWRDSHQP 1MCS YKLVTVKFEVWGLQTRVEQFV 
HKVVRDILLIGHRQAFAWVDEWYDMTMDDVREYEKNMHEQTNI K 
VCNQHSSPVDDIESHAQTST 


6255 


1 


1444 


PTRPQQELLVSLATVI FVASQKALSVESKAVIKQQLESVSNGWT 
VYRIARQASRMGNHDMAKELYQSLLTQVASKHFYFWLNSLKEFS 
HAEQ CLTGLQ EENYSSALS CIAES LKFYHKGIASLTAAST PLNP 
I*S FQCEFVKLRI DLIK>AFSQLI CTCNSLKTS PP PAIATTI AMTIi 
GNDLQRCGRI SNQMKQSMEEFRSLASRYGDLYQAS FDADSATLR 
NVELQQQSCIiLISHAI E ALI LDPBSASFQEYGS TGTAHADS EYE 
RRMMSVYNHVLEFA7ESLNGKYTPVSYMHTACLCNAIIALLKVPL 
SFQRYFFQKLQSTSIKLALSPSPRNPAEPIAVQNNQQLALKVEG 
WQHGS KPGLFRKIQS VCLNVSSTLQS KSGQDYKI PIDNMTNEM 
EQR VE PHNDYFSTQFLLNPAI LGTHN I TVES S VKDANG I VWKTG 
PRTTIFVKSLEDPYSQQIRLQQQQAQQPLQQQQQRNAYTRF 


6256 


1 


1542 


CRGAGAEPAANPRSPRSLVPSLESTSTSVPPAPGTMATDSWALA 
VDEQEAAAESLSNLHLKEEKIKPDTNGAWKTNANAEKTDEEEK 
EDRAAQSIiLNKLX RSNL VDNTNQVE VLQRDPNS PLYS VKS FEEL 
RLKPQLLO^VYAMGFNRPSKIQENALPLMLAEPPQKLIAQSQSG 
TGKTAAFVLAMLSQ VE PANKYPQCLCLS PTYELALQTGKVI EQM 
GKFYPELKLAYAVRGNKLERGQKISEQIVIGTPGTVLDWCSKLK 
FI DPKKI KVFVLDEADVM I ATQGHQDQS IRI QRML PRNCQMLLF 
3ATFEDSVWKFAQKWPDPNV1KLKREBETLDTIKQYYVLCSSR 
DEKFQALCNLYGAIT IAQAM I FCHTRKTAS WLAAELS KEGHQVA 
Ll^GEMM VEQRAAVTERFREGKEKVLVTTNVCARG I DVEQVS VV 
INFDLPVDKDGNPDNETYLHRIGRTGRFGKRGLAVNMVDSKHSM 
NILNR IQEHFNKKIERLDTDDLDEIEKIAN 




6257 


210 


615 


afipamaeliqkklqgevekyqqlqkdlSksmsqrqkleaqlte 

NNIVKEEIJ^LDGSNVVFKIiIX3PVLVKQELGEARATVGKRLDYI 
TAEIKRYESQIiRDLERQSEQQRETLAQLQQEFQRAQAAKAGAPG 
KA 




6258 


210 


615 


AFIPAMAELIQKKLQGEVBKYQQLQKDi*SKSMSGRQKLEAQLTE ' 

TAEI KRYESQLRDLERQSEQCRETLAQLQQEFQRAOAAKAGAPG 
KA 




6259 


2 


1540 


I LEKG FPSQCHPERKWKVDDVLESSQENEDDHFWELLFHNN KTV" 
S VENGDRGS KTFNLGTDP VSLRNYPYKI CDS CEMNL KNISGL 1 1 
S KKNCSRJCKPDE FNVCEKLLLD I RHE KI P IGEKS YKYDQKRNAI 
NYHQDLSQPSFGQSFEYSKNGQGFHDEAAFFTNKRSQIGETVCK 
YNECGRTFIESLKLWISQRPHLEMEPYGCSICGKSFCMNLRFGH 
QRALT KDNP YEYNE YGE I FCDNS AF 1 1 HQGAYTRKI LR E YKVSD 
KTWEKSAIiLKHQIVHMGGKSYDYNENGSNFSKKSHLTQLRRAHT 
GEICTFECGECGKTFWEKSNLTQHQRTHTGEKPYECTECGKAFCQ 
KPHLTKHQRTHTGEKPYECKQCX5KTPCVKSNI,TEHQRTHTGEKP 
YE CNACGKS FCHRSALTVHQRTHTGEKPF I CNECGKS FCVXSNL 
I VHQRTHTGE KP YKCNBCGKT FCBKSALTKHQRTHTGE K P YECN 
ACGKTF5QRSVLTKHQRIHTRVKALSTS 


6260 


2081 


1436 


GTGPE IHACAHASARAPGSRAMALRELKVCLLGDTGVGKS SI VW 
RFVEDSFDPNINPTIGASFMTKTVQYQNELHKFLIWDTAGQERF 
RALAPMYYRGSAAAI I VYDITKEETFSTLKNWVKELRQHG PPNI 
WAI AGNKCDL IDVRB VMERDAKD YADS ZHAI FVETSAKNAINI 
NELFIEISRRIPSTDANLPSGGKGFKLRRQPSEPKRSCC 


6261 


3 


1188 


FWYRLGPGTRSRWPRKGSWAASLVPRGPSPAALVTSPCPPDPLR 
SPACEPCRPDFAPRPALIiLRSGPRSAPAVTGKPALKGQPG PWPG 
MAEVS IDQSKLPGVKEVCRDFAVLEDHTLAHSLQEQE IEHHLAS 
NVQRNRLVQHDLQVAKQLQEEDLKAQAQLQKRYKDIjEQQDCEIA 
QB IQEKLA I EAERRR I QE KKDEDI ARLLQEKELQEEKKRKKHFP 
EFPATRAYADSYYYEDGGMKPRVMKEAVSTPSRMAHRDQEWYDA 
E I ARKLQEEE LLATQVDMRAAQ VAQDE E IARLLMAEE KKAYKKA 
KEREKSSLDKRKQDPBWKPKTAKAANSKSKESDEPHHSKNERPA 
RPPPPIMTDGEDADYTHFTNQQSSTRHFSKSESSHKGFHYKH 


6262 


2 


1759 


PBCHSQGLCSVHRPGKVPQARMSGLVLGQRDEPAGHRLSQEEI L 
GSTRIiVSQGLEALRSBHQAVLQSIiSQTIECIiQQGGHEEGLVHEK 
AROLRRSMEMIELGLSlglvnVMTAT.IigHT.QTX/TgqTrifnvT.p^rrtmp 

LCQENQWIJIDELAGTQQRI>QRSEQAVAQLEEEICKHLEFLGQLRQ 
YDEDGHTSBBKEGnATKDSLDDLFPNEEEEDPSNGLSRGQGATA 
AQQGGYEIPARLRTLHNLVIQYAAQGRYEVAVPLCKQALEDLBR 
TSGRGHPDVATMIiNILALVYRDQNKYKEAAHLIiNbAIiS IRESTL 
G PDH P AVAATLNNLAVL YGKRGKYKEAE PLCQ RALE IRE KVLGT 
NH P D VAKQLNNLALLCQNQG KYEAVER YYQRAIiA I YEGQLGPDN 
PNVARTKNNLASCYLKQGKYABABTLYKEI LTRAHVQEFGSVDD 
DHKP I WMHAEEREEMSKSRHHEGGTPYAEYGGW YKACKVS SPTV 
NTTIJWIX3ALYRRQGKLEAAETLBECALRSRRQGTDPISQTKVA 
ELLG E SDGRRTS QEGPGDS VKFEGGEDASVAVEWSGDG SGTLQR 
SGSLGKIRDVLRR 


" 6263 


1 


2408 


RELDSLADt,PERI KPP YANGLSTSHLRSSS VEDVKLI ISEGRPT 
IEVRRCSMPSVICEHTKQFQTISEBSNQGSLIjTVPGDTSPSPKP 
EVFSNVPERDLSNVSNIHSSEATSPTGASNSKYYSADRNL I KNT 
APVNTVMDS PVHLEPS SQ VGVIQNKS WEMPVDRLETLSTRD F I C 
PNSN I PDQESSLQS FCNS ENKVLKENADFLSIiRQTELPGNS CAQ 
DPASFMPPQQPCS FPSQSLSDAESISKHMSLSYVANQEPG I LQQ 
KKAVQ I ISSALDTDNESTKDTElfrFVLGDVQKTDAFVP VYS DS T 
IQBAS PNFEKAYTIiPVLPSEKDFNGSDASTQLNTHYAFS KLTYK 
SSSGHEVENSTTDTQVISHEKENKLESLVLTHLSRCDSDLCEI4N 
AGMPKGNLNBQDPKHCPESEKCIiIiSIEDEESQQSILSSLENHSQ 
QSTQPEMHKYGQLVKVELBENABDDKTENQI PQRMTRNKANTMA 
NQSKQILASCTLLSEKDSESSSPRGRIRLTEDDDPQIHHPRKRK 
VSRVPQP VQVS PSLLQAKEKTQQSLAAIVDSLKLDEIQPYS S ER 
ANP YFE YLH IR KKI EEKRKLLCS V I PQAPOYYDEYVTFNGS YLL 
DGNPLSKICIPTITPPPSLSDPLKELFRQQEVVRMKliRLQHSIE 
REKLIVSNEQEVLRVHYRAARTLANQTLPFSACTVLLDAEVYNV 
PLDSQSDDSKTSVRDRFNARQFMSWLQDVDDKFDKLKTCLLMRQ 
QHEAAALNAVQRLEWQLKLQELDPATY KS IS I YE IQEFYVPLVD 
VNDDFBLTPI 


6264 


143 


1960


KHRQEWNA^MAPEIH^GPMCLIENTNGELVANPEALKILSAI 
TQPVVVVAIVGLYRTGKS YLMNKLAGKNKGFSLGSTVKSHTKG I 
HMWCVPHPK3CPEHTLVLLDTEGU3DVKKGDNQNDSWIFTLAVLL 
SSTLVYNSMGTINQQAMDQLYYVTELTHRIRSKSSPDENENEDS 
ADFVS FFPDFVWTZiRDFS LDLEADGQPLTPDEYLEYSLKLTQGT 
SQKDKNFNLPRLCIRKFFPKKKCFVFDLPIHRRKIAQLEKLQDE 
ELDPEFVQQVADFCS YI FSNSKTKTLSGGI KVNGPRLESLVLTY 
I WAX S RGD L P CMENAVLALAQ I EN S AAVQ KAIAHYDQQMG QKVQ 
LPAETLQELLDLHRVSEREATEVYMKNS FKDVDHLFQKKLAAQL 
DKKRDDFCKQNQEASSDRCSAIiLQVIFSPLEEEVKAGIYSKPGG 
YCLFIQKLQDLEKKYYEEPRKGIQAEEILQTYTjKSKESVTDAIL 
QTDQ ILTEKEKE I E VECVKAES AQASAKMVEEMQI KYQQMMEE K 
EKSYQBHVJCQLTEKMERERAQLLEBQEKTLTSKLQBQARVLKER 
CQGESTQLQNEIQKLQKTLKKKTKRYMSHKLKI 


6265


143 


1960 


KHRQENNALDMAPEIH^TGPMCLIENTNGELVANPEALKILSAI 
TQPVVVVAIVGLYRTGKSYLMNKLAGKNKGFSIX3STVKSHTKGI 
WMWCVPHP JOCPEHTLVLLDTEGLGDVKKGDNQNDSWI FTCiAVLL 
S S TLVYNS MGTINQQAMDQLYYVTBLTHR I RS KS S P DENENEDS 
ADFVS FFPDFVWTLRDFSLDLEADGQPLTPDE YLEYSLKLTQGT 
SQKDKNFNLPRLCIRKFFPKKKCFVFDLPIHRRKLAQLEKLQDE 
ELDPEFVQQVADFCS YI FSNSKTKTLSGGI KVNGPRLESLVLTY 
IHAI SRGDL P CMENAVLALAQ I ENS AAVQKALAHYDQQMGQKVQ 
L P AETLQELLDLHR VS EREATEVYMKNS FKDVDHLFQKKLAAQL 
DKKR DD FCKQNQEAS SDRCS ALLQVI FS PLEEEVKAGI YS KPGG 
YCLFIQKI^DLEKKYYEEPRKGIQAEEILQTYLKSKESVTDAIL 
QTDQILTEKEKEIEVECVKAESAQASAKMVEEMQIKYQQMMEEK 
EKS YQEHVKQLTE KMERE RAQLLEBQBKTLTS KLQEQARVLKER 
COGESTQLQNEIQKLQKTLKKKTKRYMSHKLKI 


6266


276 


1421 


GSHQKQMLVPCFLYSLQNRiCPSLYGSLTCQGIGLD6 1 PEVTASE 
G FT VNE I NKKS IH I S CP KENAS S KFLAP YTTFSRIHTKS I TCLD 
I S S RGGLG VSS STDGTMKI WQASNGELRRVLEGHVFDVNCCRFF 
PSGLWLSGGMDAQLKI WSAEDASCWTFKGHKGGI LDTAIVDR 
GRNWS AS RD GTARL WD CGRSACLGVLADCGS S XNGVAVGAADN 
S INLGSPEQMPSEREVGTEAKMLLLAREDKKLQCLGLQSRaLVF 
LFIGSDAFNCCTFLSGFLLLAGTQDGNIYQLDVRSPRAPVQVIH 
RSGA PVLSLLS VRDGFIASQGDGS CFrVQQDLD YVTEXTGADCD 
P VYKVATWEKQ I YTCCRDGLVRRYQLSDL 




6267


3


622 


lgmmkknnsakrgpqdgnqqpappekvgmvrkfcgRgifreiwk 

NRYWLKGDQLYI3EKEVKDEKNIQEVFDLSDYEKCEELRKSKS 
RSKKNHSKFTLAHSKQPGNTAPNLIFLAVSPEEKESWINAIiNSA 
ITRAKNRIlJDErTWEEDSYIiAHPTRDRAKIQHSRRPPTRGHLMA 
VAS TS TSDGMLTLDL IQEEDPS PEEPTSLC 


6268 


160 


1368


HRELCQNLPAGLSSALIDNPLTLLLSIDTYVMLQEPVTFQDVAV 
DFSREEWGLLGPTQRTEYRDVMLETFGHLVSVGWETTLENKELA 
PNSD I PEEEPAPSLKVQBSSRDCALSSTLEDTLQGGVQEVQDTV 
LKQMESAQBKDLPQKKHFDNRBSQANSGALDTNQVSLQKIDNPE 
SQANSGALDTNQVLLHKIPPRKRLRKRDSQVKSMKHNSRVKIHQ 
KSCERQKAKEGNGCRKTFSRSTKQITFIRIHKGSQVCRCSECGK 
I FRNPRYFSVHKKIHTGERPYVCQDCGKGFVQS SSLTQHQRVHS 
GER P FECQECGRTFNDRSAISQHLRTHTGAKP YKCQDCG KAFRQ 
SSHL IRHQRTHTGERPYACNKCGKAFTQSSHL IGHQRTHNRTKR 
KKKQPTS 


6269 


2886


1449 


HASAFTRRNMAAASPLRDCHAWKDARLPLSTTSNEACKLFDATL ' " 
TQYVKWTNDKSLGGIEGCLSKLKAADPTFVMGHAMATGLVIilGT 
GSSVKLDKBLDLAVKTMVE I SRTQ PLTRREQLH VSAVET FANGN 
FPKACELWEQILQDHPTDMLALKFSHDAYFYLGYQEQMRDSVAR 
I YP FWTPDI PLSS YVKGI YS FGLME TN FYDQAEKLAKBALS INP 
TDAWS VOTVAHIHEMKAE I KDGLEFMQHSETLWKDSDMLACHNY 
WHWALYLIEKGEYBAALTIYDTHILPSLQANDAMLDVVDSCSML 
YRLQMEGVSVGQRWQDVLPVARKHSRDHILLFNDAHFLMASLGA 
HD PQTTQBLLTTLRDAS ES PG ENCQHLLARDVGLPLCQALVEAE 
DGNPDRVLELLLP I RYRI VQLGGSNAQRDVFNQLL IHAALNCTS 
S VHKNVARSLLMERDALKPNS PLTERL I RKAATVHLMQ 


6270 


23 


2086 


S VT\n*LGSEGIX3RPPTYHLBEMEQEPQNSEPAElKI IREAYKKA 
FLFVNKGLNTDELGQKEEAKNYYKQGIGHLLRGISISSKESEHT 
GPGWESARQMQQKMKETLQNVRTRLE ILEKGLATS LQNDLQEVP 
KLYPE FP PKDMCE KLPE PQS FS SAPQHAE VNGNTS TPS AGAVAA 
PASLS LPS QSCPABAP PAYT PQAAEGHYTVS YGTDSGE FSS VGE 
E FYRNHS QP PPLE TLGLDADBL I L I PNGVQ I FFVNPAGEVS APS 
YPG YLRI VRFLDNSIiDTVLNRPPGFLQVCDWI* Y PL VPDRS P VLK 
CTAGAYMF PDTMI^AAGCFVGVVLSS EL PEDDRELFEDLLRQMS 
DLRLQANWNRAEEENEFQIPGRTRPSSDQLKEASGTDVKQLDQG 
KKDVRHKGKRGKRAKDTSSEBVNLSHIVPCEPVPEEKPKELPBW 
S E KVAHN ILSGAS WVS WGLVKGAE ITGKA IQKGAS KL RERIQP E 
EKP VEVS PAVTKGLY I AKQATGGAAKVSQFLVDGVCTVANCVG K 
ELAPHVKKHGSKLVPBSLKKDKDGKSPLDGAMWAASSVQGFST 
VWQGLECAAKC I VNNVSABTVQT VR YK YG YNAGEATHHAVDSAV 
NVG VTAYNINN IG I KAM VKKTATQTGHTLLEDYQI VDNSQRENQ 
EGAANVNVRGEKDEQTKEVKEAKKKDK


6271 


32 


1058


GCGVKTAGMVGREKELS IHFVPG S CRLVEEEVN I PNRR VLVTGA 
TGIJbGRAVHKEFQQNNWHAVGCGFRRARPKFEQVNLLDSNAVHH 
I IHDFQPHV1VHCAAERRPDVVENQPDAASQLNVDASGNLAKEA 
AAVGAFLI Y I S SDYVFDGTNPPYREEDIPAPLNLYGKT KLJX3EK 
AVLENNLGAAVLRI PI LYGEVEKLEES AVTVM FDKVQFSNKSAN 
MDHWQQRFPTHVKDVATVCRQLAEKRMLDPSI KGTFHWSGNEQM 
TKYEMACAIAD AFNLPS SHLRP I TDS P VLGAQRPRNAQLDCS KL 
ETLG IGQRTPFRIGIKESLWPFLIDKRWRQTVFH 


6272


1134


528 


GAVMEnAAAPGRTEGVLERQGAPPAAGG^GALVELTPTPGGIAL 
VSPYHIHRAGDPLDLVALAEQVQKADEFIRANATNKLTVIAEQI 
QHI^EQARKVLEDAHRDANLHHVACNIVKKPGNIYYLYKRBSGQ 
QYFSIISPKEWGTSCPHDFLGAYKLQHDLSWTPYEDIEKQDAKI 
SMMDTLLSQSVALPPCTEPNFQGLTH 


6273 


256 


wd 


SCPRVSPECRSIX5CQVMFSLPLNCSPDHIRRGSCWGRPQDLXIA 
SAA WNSKCHPGAGAAMARQHARTLW YDRPRYVFME FCVEDS TDV 
HVLI EOHR I VFS CKNADG VE L YNE I E F YAKVNS KDS QPKRS SR5 
ITCFVRKWKEKVAW PRLTKEDI KPVWLS VDFDNWRDWEGDEEME 
LAHVEHYAEVRDNTYCVLPT 


6274 


56 


1142 


AAAAMAAAAGGGAGAARSLSRFRGCIJ^GALLGDCVGSFYEAHDT 
VDLTSVLRHVQSLEPDPGTPGSERTEALYYTDDTAMARALVQSL 
LAKEAFDE VDMAHRFAQEYKKDPDRG YGAGVVTVFKKLLNP KCR 
DVFE PARAQFNGKGS YGNGGAMRVAG I SLAYS SVQDVQKFARLS 
AQLTHASSLGYNGAILQALAVHLAIXJGESSSKHFLKQLLGHMED 
LEG DAQS VLDARE LGMEERP YS S RLKKI GELLDQAS VTREE WS 
ELGNGIAAFESVPTAIYCFLRCMEPDPEIPSAFNSLQRTLIYSI 
SLGGDTDT I ATMAGA IAGAY YGMDQVPES WQQS CEG Y EETD ILA 
QSLHRVFQKS 


6275 


20 


565 


SRRGRARCLARGSRRPVPRPAlfTMA pmu vtm i'jrih r\ r vxtt nv»r.T~y; — 

GGEDKGDGDKSAAEAQGMSREEYEBYQKQLVEEKMERDAQFTQR 

KAERATliRSHFRDKYRLPKNETDESQIQMAGGDVELPRELAKMI 

BBT>TEEEEEKASVLGQLASLPGLN1U3SLKDKAQATLGDLKQSAE 
KCHVM 


6276


797 


97 


TLLPLPPLPUTEGMILLNTGLEGTVAENPVPIVHTPSGWILTLB 
SCI^QLATHPGHWGIHLQIAEPAALRPSLALLARLSSLGLLHWP 
VWVGAKISHGSFSVPGHVAGRELLTAVAEVFPHVTVAPGWPEEV 
IjGSGYREQLLTDMLELCG^SLWQPVSFQMQAMLLGHSTAGAIGRL 
LASS PRATVTVEHNPAGGD YAS VRTALLAARAVDRTRVY YRLPQ 
3YHKDLLAHVGRN 


6277 


4600 


2744 


MAPRTE^X5LyYSyPKTlVBAPSFLNGVWMIMNDKLTEYPLVINT " 
LKRFNLYPBVILASWYRIYTKIMDLIGIQTKICWTVTIGEGLSP 
TESCEGLGDPACETVAVIFILNGIiMMALPFIYGTYtjSGSRLGGL 
VTVLCFFFNHGECTRVMWTP PLRES FS YPFLVLQMLLVTHI LRA 
TKL YRGSLIALC ISNVFFML PWQFAQ FVLLTQ IASI*FAVYWQY 
IDICKLRXIIYIHMISLALCFVLMFGKTSMLLTSYYASSLVI IWG 
IUU^HFLKINVSEI^SLWVIQGCFWLFGTVILKYLTSKIFGIA 
NDAHIGNIiLTSKFFSYKDFDTLLYTCAAEFDFMEKSTPLRYTKT 
LLLPWLVGFVAIVRKI ISDMWGVLAXQQTOVRKHQFDHGELVY 
HALQLLAYTALG I LIMRLKL FLTPHMCVMASL1 CSRQLFGWLFC 
KVHPGAI VFAIIiAAMSIQGSANLQTQWNIVGEFSNLPQEEL IEW 
IKYSTKPDAVFAGAMPTMASVKLSALRPIVNHPHYEDAGLRART 
KIVYSMYSRKAAE EVKRE L I XLKVNYY I LEES WCVRRSKPG CSM 
PEIWDVEDPANAGKTPLCNLLVKDSKPHFTTVFQNSVYKVIiEVV 
KB 


6278


3 


823 


I LFRLVLLSLVYLLNSVATE ERKPAE VLI VEGQQYAWGT VLLL 
IRIILEYCQGVDNIPSVTTDMLTRLSDLLKYFNSRSCQIiVLGAG 
ALQWGLKTITTKNLALSSRCLQLIVHYIPVIRAHFEARLPPKO 
YSMLRHFDHITKDYHDHIAE I SAKLVAIMDSLFDKLLSKYEVKA 
PVPSACFRNICKQMTECMHEAIFDLLPEEQTQMLFLRINASYKLH 
LKKQLSHLNVlNDGGPQNGLVTADVAFYTGNliQALKGLKDIiDLN 
MAEIWEQXR 






1687 


GGAMAS DGARKQ FWKRSNS KL PGS IQHVYGAQHPP FDPLLHGTL ~ 
LRSTAKMPTTPVKAKRVSTFQEFESNTSDAWDAGEDDDELI»AMA 
AESLNSEWMETANRVLRNHSQRQGRPTLQEGPGLQQKPRPEAE 
PPSPPSGDLRLVKSVSESHTSCPAESASDAAPLQRSQSLPHSAT 
VTLGGTSDPSTLSSSALSEREASRLDKFKQLLAGPNTDLEEIiRR 
LSWSGI PKPVRPMTWKLLSGYLPANVDRRPATLQRKQK3YFAFI 
EHYYDSRNDEVHQDTYRQIHIDIPRMSPEALILQPKVTE1 FERI 
LFIWAIRHPASGYVQGINDLVTPFFWFICEYIEAEEVDTVDVS 
GVPAE VLCNI EADTYWCMS KLLDG I QDNYTFAQPG I QMKVKMLE 
ELVSR I DEQVHRHLDQHE VR YLQFAFRWMNNLLMREVPLRCT I R 
LWDTYQS EPDGFSH FHLYVCAAFLVRWRKE ILEEKD FQE LLLFL 
QNLPTAHWDDEDIS LLLAEAYRLKFAFADAPNHYKK 


6280 


857 


2515 


ECCDQKMGSRNSSSAGSGSGDP^EGLPRRGAGLRRSEEEEEEDE
DVD LAQ VLA YLLRRG QVRL VQGG GAANLQ F I Q ALLDS E EEND RA 
W DG R LGDRYNP PVDATPDTRELE FNEI KTQVE LATGQLGLRRAA 
QKI1SFTRMLHQRERGLCHRGSFSLGEQSRVISHFLPNDLGFTDS 
YS QKAFCG IYS KDGQ I FMSACQDQT I RLYDCRYGRFRKFKS I KA 
RD VG WS VLD VAFTPDGNH FLYSS WS D YIH I CN I YGEGDTHTALD 
LRPDERRFAVFS IAVS S DGREVLGGANDGCLYVFDREQNRRTLQ 
IESHEDDVNAVAFADISSQILFSGGDDAICKVWDRRTMREDDPK 
PVGALAGHQDGITFIDSKGDARYLISNSKDQTiKLWDIRRFSSR 
EGMEASRQAATQQNWDYRWQQVPKXAWRKLKLPGDSSLMTYRGH 
GVLHTLIRCRFS PIHSTGQQFI YSGCSTGKVWYDLLSGHIVKK 
LT^IHKACVRDVSWHPFEEKEVSSSW1X5NLR1WQYRQAEYFQDDM 
PBSEECASAPAPVPQSSTPFSSPQ 


6281


857


2515 


ECCDQKMGSRNSSSAGSGSGDPSEGLPRRGAGLRRSE^BEEEDE 
DVDLAQVLAYLLRRGQVRLVQGGGAAITLQFIQALLDSEEENDRA 
WDG R LGDRYNP P VDATPDTRE LEFNE I KTQVE LATG QLGLRRAA 
QKHS FPRMLHQRERGLCHRGS FSLGEQSRVISHFLPNDLG FTDS 
YSQKAFCGIYS1CDGQIFMSACQDQT I RLYDCRYGRFRKFKS I KA 
RDVGWSVLDVAFTPDGNH FLYSS WSDY IHI CN I YGEGDTHTALD 
LRPDERRFAVFS IAVSS DGREVLGGANDGCLYVFDREQNRRTLQ 
IESHEDDVNAVAFADIS S Q I LFSGGDDAICKVWDRRTMREDD PK 
PVGALAGHQDG ITFI DSKGDARYLISNSKDQTI KLWDIRRFS SR 
EGMEASRQAATQQNWD YRWQQ V P KKAWRKLKL PGDSS LMTYRGH 
GVLHTLIRCRPSP IHSTGQQFI YSGCSTGKVWYDLI>SGHI VKK 
LTNHKACVRDVS WHP FEE KI VS SS WDGNLRL WQ YRQAE YFQDDM 
PESEECASAPAPVPQS STPFSSPQ 


6282 


125 


906 


RMAACRALKAVLVDLSGTLHIEDAAVPGAQEALKRLRGASVI IR 
FVTNTTKESKQDLLERLRKLEPDISEDEIFTSLTAARSLLERKQ 
VRPMUiVDDRALPDFKG IQTSD PNAWMGLAP EHFH YQ I LNQAF 
RLLLIX3APLIAXHKARYYKRKDGLALGPGPFVTALEYATDTKAT 
WGKPEKTFFLEALRGTGCEPEEAVMIGDDCRDDVGGAQDVGML 
GILVKTGKYRASDEEKIWPPPYLTCESFPHAVDHILQHLIj 


6283


140 


1043 


LSLFGIHVMNPFWSMSTSSVRKRSEGEEKTLTGDVKTSPPRTAP 
KKQLPS IPKNALPITKPTSPAPAAQSTNGTHASYGPFYLE YSLL 
AEFTLWKQKLPG VYVQP5YRSALMWFGVI PIRHGLYQDGVFKF 
TVYIPDNYPDGDCPRLVFDIPVFHPLVDPTSGEIiDVKRAFAKWR 
RNHNHI WQVLMYARRVFYKI DTAS P LNPEAAVL YEXD 1QLFKS K 
WDSVKVCTARLFDQPKI EDPYAIS FSPWNPSVHDEAREKMLTQ 
KKKPEEQHNKSVHVAGLSWVKPGSVQPFSKEEKTVAT 


6284 


1 


2879 



RSVIPGSTISSRWPGLSRPRFMAAHEWDWFQREELIGQISDIRV 
QNLQVERENVQKRTFTRW I NLHLEKCNPPLE VKDL F VD I ODG KI 
LMALLBVLSGRNIiLHEYKSSSHRIFRIiNNIAKALKFLEDSNVKL 
VSIDAAEIADGNPSLVLGLIWNIILFFQIKELTGNLSRNSPSSS 
LAPGSGGTDSDSS FPPTPTAERS VAI SVKDQRKAI KALLAWVQR 
KTRKYGVAVQDFAGSWRSGLAFLAVIKAIDPSLVDMKQALENST 
RENLEKAFS IAQDALHI PRIiLEPEDIMVDTPDEQS IMTYVAQFL 
ERFPELEAEDI FDSDKEVP I EST FVR I KETPSEQESKVFVLTEN 
GERTYTVNHETSHPPPSKVFVCDKPESMKEFRLDGVSSHAIiSDS 
STEFMHQIIDQVLQGGPGKTSDISEPSPESSILSSRKENGRSNS 
LPIKKTVHFEADTYKDPFCSKNLSIjCFEGSPRVAXESLRQDGHV 
LAVEVAEEKEQKQESSKIPESSSDKVAGDIFLVEGTNNNSQSSS 

CNGALE starhde es hs ls P pgentvmadsfqi KVNLMTVEALE 

EGDYFEAI PL KAS KFNS DL I DFASTSQAFNKVPS PHETKPDEDA 
EAFENHAEKLGKRSXKSAHKKKDSPEPQVKMDKHEPHQDSGEEA 
EGCPSAPEETPVDKKPEVHEKAKRKSTRPHYEEEGEDDDLQGVG 
EELSSSPPSSCVSLETLGSHSEEGLDFKPSPPL5 KVSVI PHDLF 
YFPHYEVPLAAVLEAYVBDPEDLKNEEMDLEBPEGYMPDLDSRE 
EEADGSQSSSSSSVPGESLPSASDQVLYLSRGGVGTTPASEPAP 
LAPHEDHQQRETKENDPMDSHQSOESPNLENIANPLEENVTKES 
ISSKKKEKRKHVDHVBSSLFVAPGS VQSSDDLEEDSSDYS I PSR 
TSHSDSS IYLRRHTHRSSESDHFSLCSVEERSRSG 


6285 


2157 


1331 


SCKTENLLEMWWFQQGLSFLPSALVIWTSAAFIFSYITAVTT»HII 
IDPALPY I SDTGTVAPEKCLFGAMLNI AAVLCIATI YVRYKQVK 
AL SPEENVT IKLNKAGL VLG ILS CLGLS I VANFQKTTLFAAHVS 
GAVLTFGMGSLYMFVQTILSYQMQPKI HGKQVFWIRLLZiVI WCG 
VSAIiSMLTCSSVIBSGNFGTDIiEQKLHTOTPEDKGYVLHMITTAA 
EWSMS FS FFGFFLTY I RDFQKI SLRVEANLHGLTL YDTAPCP IN 
NERTRLLSRDI 


6286 


1619 


276


KAGASCCGSANPYVSVGKSCVLLAMAQLQTRFYTDNKKYAVDDV 
PFSIPAASEIADIiSNirNKLIiKDKNEFHKHVEFDFLIKGQFLRM 
PLDKHMEMENISSBEWEIEYVBKYTAPQPEQCMFHDDWISS I K 
GAEEWILTGS YDKTSRIWSLEGKS IMTI VGHTD WKDVAWVKKD 
SLSCLLLSASMDQTILLWEWNVERNKVKALHCCRGHAGSVDS IA 
VDGSGTKFCSGSWDKMLKIWSTVPTDEEDEMEESTNRPRKKQKT 
EQLGLTRTPIVTLSGHMEAVSSVLWSDAEEICSASWDHTrRVWD 
VESGSLKSTLTGNKVFNCISYSPLCKRLASGSTDRHIRLWDPRT 
KDGSLVS LS LTSHTGWVTS VKWS PTHEQQLI S GSLDNI VKLWDT 
RSCKAPLYDLAAHEDKVTiSVDWTDTGLLLSGGADNKLYSYRYSP 


6287 


278 


1482 


MQFFFNFQlGUlSTSGKEKYSGDAGPtGDALQLPLQCLALDEDF 
APAKLQVQKILCDLLLPENLKEGLKESSWSSLPCTKNRPPDFHS 
VMEESQSLNEPSPKQSEEIPEVTSEPVKGSLNRAQSAQSINSTE 
MPAREDCLKRVSSEPVLSVQEKGVLLKRKLSLLEQDVIVNEDGR 
NKL KKQGETPNEVCMFS LAYGD I PEEL I DVS DFE C S LCMR LFFE 
PVTTPCGHSFCKNCLERCLDHAPYCPLCKESLKEYLADRRycVT 
OLLEELIVKYLPDELSERKKIYDSETAELSHLTKNVPIFVCTMA 
Y PTVPCPLH VFE PR YRLM I RRS I QTGTKQFGMCVSDTQNS FADY 

GCML.QIRNVHFLPDGRSWDTVGGKRFRVLKRGMKDGYCTADIE 
YLEDV 


6288 


1 


743 


VTLYPCRGLVGNLLLGASGMASGCKIGPS I LNSDLANLGAECLR I 
MLDSGADYLHLDVMDGHFVPNITFGHPWESLRKQLGQDPFFDM 
HMMVSKPEQWVKPMAVAGANQYTFHLEATENPGALIKDIRENGM
KVGLAIKPGTSVEYLAPWANQIDMALVMTVEPGFGGQKFMEDMM 
PKVHWLRTQFPSLDIEVBGGVGPDTVHKCAEAGANMIVSGSAIM 
RSEDPRSVINLLRNVCSEAAQKRSLDR


6289 


1 


743 


VTIiYPCRGLVGNLLLGASGMASGCKI GPS I LNSDLANLGAECLR 1 
MLDS GAD YLHLD VMDGH FVPNI TFGHP WES LR KQLGQDP FFDM 
HMMVS KPEQW VKPMAVAGANQ YTFHLEATENPGALI KD I R RNGM 
KVGLAIKPGTS VEYLAPWANQIDMALVMTVE PG FGGQKFMEDMM 

PKVHWLRTQFPSLDIEVDGOTGPDTVHKCAEAGANMIVSGSAIM 
RSEDPRSVINLLRNVCSEAAQKRSLDR 


6290 


3 


185* 


TUJRWI^VYETVAPTIACLPRPRLRRRRRRRRRRMISRYTRKA| 
VPQSLELKGITKHALNHHPPPEKLEEISPTSDSHEKDTSSQSKS 
DITRESSFTSADTGNSLSAFPSYTGAGISTEGSSDFSWGYGELD 
QNATEKVQ^WFTAIDELLYEQKLS VHTKS LQEECQQWTAS FPHL 
RILGRQIITPSEGYRLYPRSPSAVSASYETTLSQERDSTIFGIR 
GKKLHFSSS YAHKASS IAKSSSFCSMERDEBDS 1 1 VSEG I IEEY 
lAFDHIDIEEGFHGKKSEAATEKQSCLGYPPIAPFYCMKEDVLAY 
VFDSVWCKWSCMEQLTRSHWEGFASDDESNVAVTRPDSESSCV 
LS E LH PL VL PR VPQ S KVL Y ITSNP MS LCQ AS RHQ PNVNDLLVHG 

WPLQPRNLSLMDKLLDLDDKLLMRPGSSTILSTRNWPNRAVEFS 
TSSLSYTVQSTRRRNPPPRTLHPISTSKSCAETPRSVEEIIjRGA 
RVPVAPDSLSSPSPTPLSRNNLLPPIGTABVEHVSTVGPQRQMK 
PHGDSSRAQSAWDEPNYQQPQERLLLPDFFPRPNTTQSFLLDT 

qyrrscaveyphqarpgrgsagpqlhgstksqsggrpvsrtrqg 


6291 


1732 


602 


bV7\KMASSASARTPAGKRVINQEELRRljMKEKQRLSTSRKRiE"s^ I 
PFAKYNRLGQLSCALCNTPVKSBLLWQTHVLGKQHREKVAELKG 
AKEASQGS SASSAPQS VKRKAPDADDQDVKRAKATLVPQVQPST 
SAWTTNFD KIGKE Fl RATPS KPSGLSLLPDYEDEE EEEEEEEGD 
GBRKRGDASKPLSDAQGKEHSVSSSREVTSSVLPNDFFSTNPPK 
APIIPHSGSIEKAEIHEKWERRENTAEALPEGFFDDPEVDARV 
RKVDAPKDQPTOKEWDEFQKAMRQVNTI SEAI VAEEDEEGRLDRQ 
IGEIDEQIECYRRVEKLRNRQDEIKNKLKEILTIKELQKKEEEN 
ADS DDEGE LQDLLSQDWRVKGALL


6292 


1835 


1142 


TCPGAMKMVAPWTRFYSNSCCLCCHVRTGTILLGVWYLI INAVvH 
LL 1 LLSALADPDQYNFSSSELGGDFE FMDDANMCIAI AISLLM 1 
LI CAMATYGAYKQRAAW 1 1 P FFCYQI FD FALNMLVAI TVL 1 Y PN 
S I Q E Y I RQLPPNFP YRDDVMS VNPTCL VL I I LLFI S I ILTFKG Y 
LIS CVWNCYRY INGRNS SDVLVYVTSNDTTVLLPPYDDATVNGA 
AKBPPPPYVSA J 


6293 


2382 


103$ 


FWCTLGTVDVHPIGWCAINSKILVPPRTIHAKFTDWKGYLMKRL 
VGSRTLPVDFHIKMVESMKYPFRQGMRLE WDKSQVS RTRMAW 
DTVIGGRLRLL YEDGDSDDDFMCHMWS PL I HPVGWSRRVGHG IK 
MSERRSDMAHHPTFRKI^CDAVPYLFKKVRAVYTEGGWFEEGMK 
LEAI DPLNIiGNI CVAT VCKVLLDG YLM I CVDGGPSTDGLDWFCY 

hasshai fpatfcqkndieltppkg yeaqtfnwenylektkska 
apsrlfnmix:pnhgfkvgmkleavdlmeprlicvatvkrvvhrl 

LS IHFDG WDS EYDQ WVDCE S PD I YP VG WCELTG YQLQ PP VAAE P 
ATPLKAKEATKKKKKQFGKKRXRI PPTKTRPLRQGSKKPLLEDD 
PQGARKIS S BP VPGB I IAVRVKEEHLDVASPDKAS 5 PELPVS VE 
NIKQETDD 


6294 


354 


1814 


AQLTTRGRTVAGGVRWIPSPFPDLBLySCCLGTDRGFPEIiSHHC 
KNV IATAS DYDNAE I TNI R PS FDVS P WAGLIGAS VLWCVSVT 
VFVWS CCHQQAEKKHKNPP YKFIHMLKG 1 5 1 YPETLSNKKKI I K 
VRRDKDGPGREGGRRNLLVDAAEAGIiLSRDKDPRGPSSGSCIDQ 
LP IKMDYGEELRSP1TSLTPGBSKTT3PSS PEEDVMLGSLTFSV 
DYNFPKKAbWTIQEAHGLPVMDDQTQGSDPYlKMTILPDKRHR 
VKTRVLRKTLDPVFDETFTFYGIPYSQLQDLVLHFLVLSFDRFS 
RDDVIGEVMVPLAGVDPSTGKVQLTRDI I KRNIQKCI S RG ELQV 
SLS YQPVAQRMTVWLKARHJ^KMDIAGLSGNPYVKVNVYYGRK 
RIAKKKTHVKKCTLNPIFlTESFIYDIPTDIiPDISIEFliVIDFD 
RTTKNEWGRLILGAHSVTASGAEHWREVCESPRKPVAKWHSLS 
EY 


6295


2795 


617 


VS S ALLTGATSGSDAAKS EGASAS PLSCTNAVAMDRPDEGP PA K 
TRRLSSSESPQRDPPPPPPPPPLLRLPLPPPQQRPRLQEETEAA 
QVIiADMRGVGLGPALPPPPPWILEEGGIRAYFTLGAECPGWDS 
TIESGYGEAPPPTESLEALPTPEASGGSLEIDFQWQSSSPGGE 
GALETCS AVGWAPQRLVDPKS KBEAI 1 I VBDEDEDERES MRS SR 
RRRRRRRRKQRKVKRESRERNAERMES ILQALEDIQLDLEAVNI 
KAGKAFLRLKRKFIQMRRPFLERRDLI IQHIPGFWVXAFLNHPR 
IS ILINRRDEDIFRYLTNLQVQDLRHISMGYKMKLYFOTNPYFT 
NMVIVKBFQRNRSGRLVSHSTPIRWHRGQEPQARRHGNQDASHS 
FFSWFSNHSLPEADRIAEIIKNDLWVNPIiRYYLRERGSRIKRKK 
QEMKKRKTRGRCEWI MEDAPDYYAVEDI FSEISDIDETIHDI K 
ISDFMETTDYFETTDNEITDINENICDSENPDHNEVPNNETTDN 
l^ADDHETTDNNESADDNNENPBDNNKNTDDNEENPNN^ 
GNNFFKGGFWGSHGNNQDSSDSDNBADEASDDEDNDGNEGDNEG 
SDDDGNEGDNEGSDDDDRD I E YYEKVI EDFDKDQADYEDVI E 1 1 
SDESVEEEGIEEGIQQDEDI YEEGNYEEEGSEDVWEEGEDS DDS 
DLEDVLQVPNGWANPGKRGKTG 


6296 


727 


1199 


RHCGCDAQGACDS LP PTGTS S P VTARN AI PEARCC VWLLDG TT V 
EAVRPARERLARKELRQKRMQQFSRDSAYSSNKDSTCLLTERDT 
LGTS LQ FPS PFSGTI S FGS FS DS (3 1 FPLGSQCCLG FQQ FS I SGK 
KWALIHKRVRLS VFGARWGRI YFGK 


6297 


1 


922 


QRAAAAS PSSCGPRGAEYGALMAMEGYWRFIiALLGSALLVG FLS 
VI FALVMVIjHYREGIiGWDGSAIjEFNWHPVIiMVTO FVF I QG XAI I 
VYRLPWTWKCSKLLMKSIHAGLNAVAAILAIISVVAVFENHNVN 
WIANMYSLHS WVGLIAVI CYLLQLLSQFS VFLLPHAPLSLRAFL 
pjrxnv ism vxtisi viai/UjMGIjTEKIiIFSIiRDPAYSTFPPEGV 
FVNTLGLLI LVFGALI FWI VTRPQWKRPKBPNSTILHPNGGTEQ 

gargsmpaysgnnmdksdselnnevaarkrklaldeagqrstm 


6298 


3 


985 


SVPLRIUiSLSGTI^AGTTTKMAVARIiAAVAAWVPCRSWGWAAV 

pfgphrglsvllaripqraprwlpacrqktslsflnrpdlpnla 
ykklkgkspgiifipgyiisymngtkalaieefckslghacirfd 

YSGVGSSDGNSEESTLGKWRKDVLS I IDDLADGPQI LVGS SLGG 
WLMLHAAI AR PEKWALIGVATAADTL VTKFKQLP VBL KKEVEM 
KGVWSMPSKYSEEGVYNVQYSFIKEAEHHCLLHSPIPVNCPIRL 
LHGMKDD I VPWHTSMQ VADRVLSTDVDV I LRKHSDHRMRE KADI 
QLLVYTIDDLIDKLSTIVN 


6299 


512


814 


ECDLEGIMPNVTISLSLPTNGSPLQDILVHPCVTSLDSArLTS's"' 
UJ - uni »wuonroorixy:rr ir'riJoorivLA.r x 1£»UVPVPPILGFYQ 
MKEE EVQLRNNH 


6300 


121 


692 


AAPSCWSQRGVPAAGTPSSPRLLVSRAAAPSAGPWGAWRQGARA 
AQSP?SIPNSSSVPYGSQDSVHSSPEDGGGGRDRPVGGSPGGPR 
LVIGSLPAHLS PHMFGGPKCPVCSKPVSSDEMDIiHLVMCLTKPR 
ITYNEDVLSKDAGECAICLEEI^QGDTIARLPCLCIYHKGCIDE 
WFEVNRSCPEHPSD 


6301 


616 


284 


GKFVPVNWEPPQPLPFPKi-LRCYRCIiLETKBLGriLGSbiCLtP 
AGSSCITLHKKNSSGSDVMVSDCRSKEQMSDCSNTRTSPVSGPW 
IFSQYCFLDFCNDPQNRGLYTP 


6302 


490 


745


IFGFLHLFHMEHSFLLVCALFAHVFFSSSCGSSVALHSDPCLLS 
PVLLNCLPGDLRPLDELYAQKLKYKAISEELDHALNDMTSL 


6303 


2 


1961 


YWNEYGGGLLWQSWQEKHPGQALS5EPWNFPDTXEEWEQHYSQL 
YWYYLEQFQYWEAQGWTFDASQSCDTiyrYTSKTEADDKNDEKCM 
KVDLVS FLSSP IMGDNDSSGTSDKDHSEILDGISNIKLNSEEVT 
QSQLDS CTSHDGHQQLS EVS S KRECPASGQSE PRNGGTNE ESNS 
SGNTNTDPPAEDSQKSSGANTSKDRPHASGTDGDESEEDPPEHK 
PSKLKRSHELD IDENPASDFDDSGSLLGFKYGSGQKYGGI PNFS 
HRQ VR YLEKNVKL KSKYLDMRRQIKMKNKHIFFTKES EKP FFKK 
SKILSKVEKFLTWVNKPMDEEASQESSSHDNGHDASTSCDSEEQ 
DMSVKKGDDLLETlWPEPEKCQSVSSAGEIiETENYKRDSLLATV 
PDEODCVTQEVPDSRQAETEAEVKKKKNKKKNKKVNGLPPEIAA 
VPELAKYWAQRYRLFSRFDDG IKLDREGWFSVTPEKIAEH IAGR 
VSQS FKCDVWDAFCGVGGNT I QFALTGMR VI AID I DP VKI ALA 
RNFNAEVYGIADKIEFlCGDFliIiASFLKADVVFIiSPPWGGPDYA 
TABT FD IRTMMS PDG FBI FRI*S KKITNN IVYFLPRNADIDQVAS 
LAGPGGQVEIEQNFLNNKLKTITAYFGDLIRRPASET 


6304 


1 


1438 


HRAR VDRS RES PGGDLRHPGRVRRD I TLSGHFRLS TQH V VLLR E 
DEVGDPGTKDLGHPQHGS P IQETQSEWTLVSPIiPGSDMAAIiPA 
NRATSGLTLWPHTAEGRDLLGAENRALTGGQOAEDPTLASGAYQ 
WPGSVEKI^GSWCDAETLLSSSRTGGQAPPWLTDHDVQMLRIiL 
AQ3EVVDKAR VPAHGQVLQVGFSTEAALQDLSS PRLSQLCS QGL 
CGLI KR PGDLP E VLS FHVDRVLGLRRS LPAVARRFHS PLLP YRY 
TDGGAR P V I WW APD VQKLS D PD EDQNS LALGWLQ YQALLAH S CN 
WPGOAPCPG IHHTE WARLAL FDFLLQVHDRLDRYCCG FEPEPSD 
PCVEERLRE KCRNPAELRLVHILVRS SD PSHLVYIDNAGNLQHP 
EDKLNFRLLEGIDGFPESAVKVLMGCLQNMLLKSIiQPMJPVFWE 
SQGGAQGLKQVLQTLEQRGQVLLGHIQKHNLTLFRDEDP 


6305 


99 


420 


nmiwrgrstyrprprrs^pppeligpmlepgdeepqqeepptes 
rdpapgqereedqgaaetqvpdleadlqelsqsktgdecgdgpd 

VQGKILTKS EQFKM PEGR 


6306 


1 


1874


PTRPSKVKVPHTFLIHSYTRPTVCQACKKLLKGLFRQGLQCKDC 
KFNCHKRCATRVPNDCLGEALINGDVPMEEATDFSEADKSALMD 
ESEDSGVIPGSHSENALHASEEEEGEGGKAQSSLGY1PLMRWQ 
SVRHTTRKSSTTLREGWVVHYSOTOJTLRKRHYWRLDCKCITLFQ 
NNTTNR Y YKE I PLS E I LIVES AQNFSLVPPGTNPHCFE I VTANA 
TYFVGEM PGGTPGGPS GQGAEAARGWETAIRQALMPVILQDAPS 
APGHAPHRQASIjS ISVSNSQI QENVDIATVYQI FPDEVLGSGQF 
GWYGGKHRKTGRDVAVKVIDKLRF PTKQE S QLRN EVA I LQS LR 

HPGIVNLECMFETPEKVFVVMEKLHGDPILEMrLSSEKGRLPERL 
TKFLITQI LVALRHLHFKNIVHCDLKPENVLLASADPFPQVKLC 
DFGFARI IGEKSFRRSWGTPAiXAPEVLLNQGYNRSLDMWSVG 
VIMYVSLSGTFPFNEDED INDQ IQNAAFMYPASPWSHISAGAID 
LINNLLQVKMR KRYS VDKS I*SH PWLQEYQTWLDLRELEGKMGER 
YITHESDDARWEQFAAEHPLPGSGLPTDRDLGGACPPQDHDMQG 
LAERISVL 


6307 


2136 


589 


CFLLPRGRDPEPPEAGAAAPCAPGAPD^SFRKVVRQSKPRiiVFG 
QPVFCNDOCYEDIRVSRVTWDSTFCAVNP KFXAVT VEASGGGAFI* 
VLPLSKTGRIDKAYPTVCGHTGPVLDIDWCPHNDEVIASGSEDC 
TVMVWQ I PBNGLTSP LTE PVWLEGHTKRVG 1 I AWHPTARNVLI* 
SAGCDNVVLIWNVGTAEBLYRLDSLHPDLI YNVSWNHNGS LFCS 
ACKDKSVRIIDPRRGTLVAEREKAHBGARPMRAIFLADGKVFTT 
GPSRMSERQLALWDPBNLEEPMALQELDSSNGAIiLPFYDPDTSV 
VY VCGKGDS S 1 RYFE I TEE PPY I HFLNTFTSXEPQRGMGS MPKR 
GLEVS KCE IARFrKLHERKCEP I VMTVPRKSDLFQDDLYPDTAG 
PEAALEAEEWSGRDADPILISLREAYVPSKQRDLKISRRNVLS 
DSRPAMAPGSSHLGAPASTTTAADATPSGSIiARAGEAGKLEEVM 
QELRALRALVKEQGDRICRLEEQLGRMENGUA 


6308 


2 


1118


GRPTRPEKMLLSLVLHTySMRYLLPSVVLLGTAP^YVLAWGVWR ' 
LLSAFLPARFYQALDDRLYCVYQSMVLFFFENYTGVQILLYGDIi 
PKNKENI IYIANHQSTVDWIVADILAIRQNALGHVRYVLKEGLK 
WLPLYGWYFAQHGGIYVKR5AKFNE3CEMRNKLQSYVDAGTPMYI* 
VIFPEGTRYNPEQTKVLSASQAFAAQRGLAVLKHVLTPRIKATH 
VAFDCMKNYLDAI YDVT W YEGKDDGGORRES PTMTBFLCKECP 
KIHIHIDRIDKKDVPEEQEHMRRWLHERFEIKDKMLIEFYESPD 
PERRKRF PGKS VNS KLS 1 KKTLPSMLILSGLTAGMZiMTDAGRKL 
YVNTWIYGTLLGCLWVTIKA 


6309 


220 


563 


LVAEVKE PCS LPMLS VDMENKENGS VGVKNSMENGR PPDPAD WA 
VMDWNY FRTVGFEEQAS AFQEQE I DGKSLLLMTRNDVLTGLQL 
KLGPALKIYEYHVKPLQTKHI.KNNSS 


6310 


36 


979 


GPRCWKFLIJbSSVNCErrl,RIGKAWPQS6GQERYWTPRTHSSAS2' 
AQRGSLAELNVAAAG LWADCDQPLYDC PMCGL I CTWYHILQEHV 
DLHLEENS FQQG MD RVQCS GDLQLAlIOLOOEEDR K R R q PF<5 p op 
IEEFQKLQRQYGliDNSGGYKQQQLRNME IEVNRGRMPPSEFHRR 
KADMMESLALGFDDGKTKTSGI X EALHR YYQNAATDVRRVWLSS 
VVDHraSSLGDKGWGCGYRNFQMLIiSSLLQNDAYNDCrjKGMLIP 
CI PKIQSMI EDAWKEGFD PQGASQLI IRLQGTKAWIGACEVYIL 
LTSLRV 


6311 


1 


675 


P VWWNS CEG PRLAAAARTGHGVGRRARIACLGEPRVKAAVKIiTL 
AS KLKRDDGLKGSRTAATASDSTRRVSVRDKLLVKEVAELEANL 
PCTCKVHFPDPNKIjHCFQLTVTPDEGYYQGGKFQFETEVPDAYIJ 
MVP PKVKCLTKI WHPNTTETGE I CLS LLREHS XDGTGWAPTRTIj 

IQ0VVWGLNSLFTDLLNFDDPLNIEAABHHIJ2DKEDFRNKVDDYI 
KRYAR 


6312 


213 


1400 


GDEbVKREAGMKMLPGVGVFGTGSSARVLVPIJjRABGFTVEALW 
GKTEEEAKQLAEEMNI AFYTS RTDD ILLHQDVDLVC I S I PPPLT 
RQI S VKALG I GKNWCEKAATS VDAFRM VTAS RYY PQLMS L VGN 
VLRFLPAFVRMKQLISEHYVGAVMI CDARI YSGSLLS PS YGWI C 
DELMGGGGIxHTMGTYIVDLLTHLTGRRAEKVHGLLKTFVRQNAA 
IRGIRHVTSDDFCFFQMLMGGGVCSrVTLNFNMPGAFVHEVMVV 
GSAGRLVARGADLYGQKNSATQEELLLRDSLAVGAGLPEQGPQD 
VPLLYLKGMVYMVQALRQSFQGQGDRRTWDRTPVSMAASFEDGL 
YMQSWDAI KRSSRSGEWEAVEVLTEEPDTNQNIiCEALQRNNL 


6313 


2 


2071 


QRSGAARLAFLPSPFS PACVHRSPLS FHGCWFYFWVFMPLGVL 
FHRRRAHGCTLSCSSFVEQPTAMEAEETIWECIiQEFPEHKKMI^ 
RI^EQREQDRFTDITLIVXXSHHFKAHKAVLAACSKFFYKFFQEF 
TQEPLVEI EGVS KMAFRHLIEFT YTAKLMIQGEEEANDVWKAAE 
FLQMLEAI KAIiE VRNKENSAPLEENTTGKNRAKKRKIAETSNVI 
TESIiPSAESEPVEIEVElAEGTIEVEDEGIETLEEVASAKQSVK 
YIQSTGSSDDSALALLADITSKYRQGDRKGQIKEDGCPSDPTSK 
QVEGIEIVEI^LSHVKDLFHCEKCNRSFKLPYHFKEHMKSHSTE 
S FKCE I CNKR YLRES AWKQHLNCYHIjE EGG VS KKQRTGKK IH VC 
QYCEKQFDHFGHFXEHLRKHTGEKPFECPNCHERFARNSTLKCH 
LTACQTGVGAKKGRKKLYECQVCNSVFNSWDQFKDHLVIHTGDK 
PNHCTLCDLWFMQGNELRRHLSDAHKrSERLVTEEVLSVETRVQ 
TE P VTSMTI I EQVGKVHVLPLLQVQVDS AQVT VEQVHPDI*IiQDS 
QVHDSHMSBLPEQVQVSYLEVGRIQTBEGTEVHVEELHVERVNQ 
M P VEVQTELLEADLDHVT PE I MNQEE RE5SQADAAE AARE DHED 
AEDLETKPTVDSEAEKAENEDRTALPVLE 


6314 


2 


2071 


qrsgaarlaflpspfspacvhrsplsfhgcwfyfvwfmplGvl 
fhrrrahgctiiscssfveqptameaeetmeclqefpehhkmlld 
RLNEQREQDRFTDITLIVDGHHFKAHKAVLAACS kffykffqef 
TQE P LVE I EG VS KMAFRHLI EFTYTAKLMIQG EEEAND VW KAAE 
FLQMLEAI KALE VRNKENSAPLEENTTGKNEAKKRKIAETSNVI 
TESLPSAESEPVEIEVEIAEGTIEVEDEGIETLEEVASAKQSVK 
YIQSTGSSDDSALAIiLADITSKYRQGDRKGQIKEDGCPSDPTSK 
QVEGIEIVELQIiSHVKDLFHCEKCNRSFKLFYHFKEHMKSHSTE 
SFKCEICNKRYLRESAWKQHLNCYHLEEGGVSKKQRTGKKIHVC 
QYCEKQFDHFGHFKEHLRKHTGEKPFECPNCHERFARNSTLKCH 
LTACQTGVGAKKGRKKLYECQVCNSVFNSWDQFKDHLVIHTGDK 
PNHCTLCDLW FMQGNELRRHLSDAHN I S ERLVTEE VLS VE TRVQ 
TEPVTS MTI I EQVGKVHVLPLLQVQVDSAQ VTVEQVHPDL LQDS 
QVHDSHMSELPEQVQVS YLE VGRI QTEEGTEVHVEELHVERVNQ 
MPVEVQTELLEADLDHVTPEIMNOEERESSQADAAEAAREDHED 
AEDLETKPTVDSEAEKAENEDRTALPVLE 


6315 


1 


1015 


LGLAVNWTTL VLI S YCPTATEEAP Y WTYLLCALGLFI YQSLDA 
IDGXQARRTNS CSPLGELFDHGCDSLS TVFMAVGAS IAARLGTY 
PDWFFSCSFIGMFVFYCAHWQTYVSGMLRFGKVDVTEIQIALVX 
VFVLSAFGGATMWDYTIPILBIKLKI LPVLGFLGGVI FSCSNYF 
HVILHGGVGKNGST I AGTS VLS PGLHIGLI 1 1 LA I M I YKKSATD 
VPEKHPCLYILMTOCVFAKVSQKLWAHMTKSELYIiQDTVFLGP 
GLLFIxDQYPNNFIDEYVVLWMAMVISSFDMVIYFSALCLQlSRH 
LHLNI FKTACHQAPEQVQVLSS KSHQNNMD


6316


1503 


792 


Vsagagtgj:^gttstrrvtfeadenenitvvk6irlsenvidr 

MKESSPSGSKSQRYSGAYGASVSDEELKRRVAEELALEQAKKES 
EDQKRLKOAKELDRERAAANEQLTRAILRERI CSEEERAKAKHL 
ARQLEEKDRVLKKQDAFYKEQLARLEERSSE FYRVTTEQYQKAA 
EEVEAKFKRYESHPVCADLOAKILQCYRENTHQTLKCSAIiATQY 
MHCVNHAKQSMLEKGG 


6317 


102 


839 


PEAQTSAVLAREKGHLPTMRHEAPMQMASAQDARYGQKDSSDQN 
FDYMFKLL I IGNSSVGKTS FLFRYADDS FTSAFVST VGIDPKVK 
TVFKNEKRIKIiQIWDTAGQERYRTITTAYYRGAMGFlLMYDlTN 
EES FNAVQD WSTQI KT YSWDNAQVILVGNKCDMEDERVISTE RG 
QHIjGEQLGFEFFETSAKDNINVKQTFBRLVDIICDKMSESEiETD 
PAITAAKQNTRLKETPPPPQPNCAC 


6318


1765 


733 


PWHPLRTLPLHHPHPRPPRABGREGADSMSHLPGLELRREAPPL 
LGPLLS P F PIjPAGS WHRQMLRSS LRFPI TNS AGAPCXAAGRMNI 
LAP VRRDR VLAEL P Q CLRKEAALHGH KD FHPR VTCACQEHRTG T 
VGFKISKVIVVGDLSVGKTCLXNRFCKDTFDKNYKATIGVDFEM 
ERFE VLGI PFSLQLWDTAGQERFKCIASTY YRGAQAI I IVFNLN 
DVAS L EHT KQWLADALKENDPSS VLLFLVGS KKDLSTPAQ YALM 
E KDALQVAQEMKAE YWAVSSLTG ENVREFFFRVAALTFEANVLA 
ELEKSGARRIGDWRINSDDSNLYLTASKKKPTCCP 


6319 


88 


717 


AATMRI^QNTLIJ^KKVVLVPYTSEHVPSRYkEWMKSBELQRLT 
AS EPLTLEQE YAMQ CS WQEDADKCTFI VLDAEKWQ AQPGATEES 
CMVGDVNLFLTDLEDLTLGEIEVMIAEPSCRGKGLGTEAVLAML 
SYGVTTLGLTKFEAKIGQGNBPSIRMFQKLHFEQVATSSVFQEV 
TLRLTVS E S EHQWLIiEQTSHVE EKP YRDGS AE PC 


6320 


90 


1111 


RPRTGRBKVAMAAVDS FYLLYRE I ARSCN CYMEALALVGAWYTA 
RKSITVICDFYSLIRLHFIPJRLGSRADLIKQYGRWAWSGATDG 
IGKAYABELASRGLNI ILISRNEEKLQWAKDIADTYKVETDI I 
VADFSSGRE I YLP I REALKDKD VG I LVNNVGVFYP YPQYFTQLS 
EDKLWDI INVNIAAASLMVHWLPGMVERKKGAIVTISSGSCCK 
PTPQLAAFSASKAYLDHPSRALQYEYASKG1FVQSLIPFYVATS 
MTAPSNFE,HRCSWLVPSPKVYAHHAVSTLGISKRTTGYWSHSIQ 
FLFAQYMPEWLWVWGANI LNRSLRKEALSCTA 


6321 


1418 


341 


HRKAALGALMAGRIJ^KAl^VSLSIJUaASVTIRSSRCRGIQAF 
RNSFSSSWFKXNTNVMSGSNGSKENSHNKARTSPYPGSKVERSQ 
V PNEKVG WLVE WQD YKPVE YTAVS VLAG PRWADPQ IS ESNFSP K 
FNE KD GHVE R KS KNGL YE I ENGR P RN P AGRTGLVGRGLLGRWG P 
NHAADPIITRWKRDSSGWKIMHPVSGKHILQFVAIKRKDCGEWA 
IPGGMVDPGEKISATLKREFGEBALNSLQKTSAEKREIEEKLHK 
LFS QDHLV I YKGYVDD PRNTDNAWMETEAVNYHDBTGE I MDNLM 

LEAGDDAGKVKWVDINDKLKLYASHSQFIKLVAEKRDAHWSEDS 
EADCHAL 


6322 


2047 


1083 


NQEI LKNVES SRTVQPHFLEPLL3LGWSVDVGRHPGWTGHVSTS 
WS INCCDDGEGSQQEEVISSEDIGAS I FNGQXKVL YYADALTE I 
AFWPSPVESLTDSLBSNISDQDSDSNMDLMPGIIiKQPSLTLEL 
FPNHTDNLNSSQRLSPSSRMRKLPQGRPVPPLGPETRVSWWVE 
R YDD IENFPLS ELMTE I STGVETTANS STSLRSTTLEKEV PVI F 
IHPLNTGLFRIKIQGATGKFNMVIPLVDGMIVSRRALGFLVRQT 
VINICRRKRLESDSYSPPHVRRKQKITDIVNKYRNKQLEPEFYT 
SLFQEVGLKNCSS 


6323 


1 


656 


PASTTDGAQE ARVPUX3AFW I PRP PAGS PKGCFAC VS KPP ALQA 
PAAPAPEPS AS P PMAPTLFPMESKS S KTDSVRAAGAP PACKHLA 
EKKTMTNPTTVI £ VYPDTTEVNDYYLWSI FNFVYLN FCCLGFI A 
LAYSLKVRDKKLLNDLNGAVEDAKTDRLINI TRS GLAAS C I ML W 
MALS VIATHR GhRSSAS I L VAEPHDWNTERPQ VTFR ER CPAL 


6324 


X 


2061 


EGAGMRRCPCRGSLNEAEAGALPAAARMGLEAPRGGRRRQPGQQ 
RPGPGAGAPAGRPEGGGPWARTEGSSLHSEPERAGLGPAPGTES 
PQAEFWTDGQTEPAAAGLGVETERPKQKTEPDRSSLRTHLEWSW 
SELGTTCLWTETGTDGLWTDPHRSDLQFQPEEAS PWTQPGVHGP 
WTELETHGSQTQPERVKSWADNLWTHONSSSLQTHPEGACPSKE 
PSADGSWKELYTDGSRTQODIEGPWTEPYTDGSQKKQDTEAARK 
Q PGTGGFQ IQQDTDGSWTQPSTDGSQTAPGTDCLLGB PEDG PLE 
EPEPGELLTHLYSHLKCSPLCPVPRLIITPETPEPEAQPVGPPS 
RVEGGSGGFSSASSFDESEDDWAGGGGASDPEDRSGSKPWKKL 
KTVLKYSPFWS FRKHYP WVQLSGHAGNFQAGBDGR I LKRFCQC 
EQR3LEQLMKDPLRPFVPAYYGMVLQDGQTFNQMEDLLADFEGP 
SIMDCKMGSRTYLEEELVKARBRPRPRKDMYEKMVAVDPGAPTP 
EEHAQGAVTKPRYMQWRETMSSTSTLGFRIBGIKKADGTCN'rNF 
KKTQALEQVTKVLEDFVDGDHVI LQKYVACLEELREALEI SPFF 
ivinis v VL»i»i>ijLtr VHJJH xXjXaAJwWMIDFGKTVALPDHQTLSHRLP 
WAEGNREDGYLWGLDNMICLLQGLAQS 


6325 


165 


944 


GLRDP FRRKRRLKPQVKMSN YVNDMWPGS PQEKDSPS TSRSGGS 
S RLS S RSRSRS FS R3SRSHSRVSS RFS S RS RRS KS RS RSRRRHQ 
RKYRRYSRS YSRSRSRSRSRRYRERRYGFTRRYYRS PSRYRSRS 
RS RS RS RGRS YCGRAYAI ARGQR YYGFGRTV YPEEHSRWRDRS R 
TRSRSRTPFRI^EKDRMELLEIAKTNAAKALGTTNIDLPASLRT 
VPSAKETSRGIGVSSNGAKPEVSILGLSEQNFQKANCQI 


6326 


238 


680 


GEPSPATQQKPSATGAGVLHQHFSSGHIYVLMGLIiPPPWTISFT" 
VQTTLQPPGGLPAAPVSGRMAFEPVGRDLARRMVPRAGKRTQTL 
GARRVAAQGAR PL PEDRRP KSGERLH VTVAPCWEFVLPS VS LTA 
QAWGG VGQE AS SGV P 


6327 


1 


1337 


SI^I^PAGGS^PTQQPAAPSTRAPKPSRSlJSGSrCALFSDA 
DSGSGMKAELPPGPGAVGREMTXEEKLQLRKEKKQQKKKRBDEEK 
GAEPETGSAVSAAQCQGPTRELPESGIQLGTPRBKVPAGRSKAE 
LRAERRAKQEAERALKQARKGBQGGPPPKASPSTAGETPSGVKR 
LPEYPQVDDLLLRRLVKKPERQQVPTRKDYGSKVSLFSHLPQYS 
RQNSLTQFMSIPSSVIHPAMVRLGLQYSQGLVRGSNARCIALLR 
ALOQVIQDYTTPPNEELSRDLVNKLKPYMSPLTQCRPLSASMHW 
AIKPIiNKEITSVCSSKREEEAKSELRAAIDRYVQEKIVLAAOAI 
S RFAYQKI SNGD VILVYGCSS LVSRILQEAWTEGRR FRWWDS 
R P WLEGRHTLRSL VHAG VPAS YLLIPAAS YVLPEVS TEEKDS KV 
GGEKV 


6328


1030 


276 


HASAE VTTAAARGLGAM EEEMHTDAK I RAENGTGS SPRGPGCSIi 
RHFAC EQNLLSR PDGS AS FLQGDTS VLAG VYGPABVKVSKE I FN 
KATLEVIIiRPKIGLPGVAEKSRERLIRNTCEAWLGTIiHPRTSI 
TWLQWS DAGS LLACCLNAACMALVDAGVPMRALFCGVACALD 
SDGTLVLDPTSKQEKEARAVLTFALDSVERKLLMSSTKGLiYSDT 
ELQQCLAAAQAASQHVFRFYRESLQRRYSKS


6329 


3 


2016 


SS EVAAGGGTRSAMAEGSGE VVTVSATGAANGLNNGAGGTSATT 
SNPLSRKLHKILETRLDNDKEMLEALKALSTFFVENSLRTRRNL 
RGDIERKSLAINEEFVS I FKEVKEELESIS EDVQAMSNCCQDMT 
S RLQAAKEQTQDLIVKTTKLQ SESQKLE I RAQ VABAFLS KFQLT 
S DEMS LLRGTREGP I TEDFFKALGRVKQI HNDVKVLLRTNQQTA 
G IiE IMEQMALLQETAYERLYRWAQSECRTLTQES CDVS PVLTQA 
MEALQDRPVLYKYTLDEFGTARRSTWRGFIDALTRGGPGGTPR 
P I EMHS HDPLR YVGDMLAWLHQATASE KEH LEALLKHVTTQGVE 
ENrQBWGHITEGVCRPLKVRIEQVIVAEPGAVr*LYKISNLLKF 
YHHTISGIVGNSATALLTTIEEMHIjLSKKIFFNS1»SIjHASKIjMD 

kvelpp pdlgpssalnqtlmllrevlashdss wpujarqadfv 

qvi^cvldpi^mctvsasnlgtadmatfmvnsd^ 

eftdrrlemlqf^ieahldtlineqasyvltrvglsyiyntvqq 

HKPEQGSLANMPNLDSVTLKAAMVQFDRYLSAPDNIiLI PQLNFL 
LSATVKEQIVKQSTBLVCRAYGEVYAAVMNPINEYKDPENILHR 
SPQQVQTLLS 


6330 


1151


333 


FFYYTFYENKTFSRKMVAEKETI1SI1NKCPDKMPKRTKLLAQQPL 
PVHQPHSLVSEGFTVKAMMKNSVVRGPPAAGAFKERPTKPTAFR 
KFYERGDFPIALEHDSKGNKIAWKVEIEKLDYHHYIiPLFFDGLC 
EMTFPYEFFARQGIHDMJjEHGGNKI LP VIiPQLI I PIKNALNLRN 
RQV I CVTLKVLQHLWS AEMVGKALVP YYRQ I LP VLNI FKNMNV 
NSGDGIDYSQQKRENIGDLIQETLEAFERYGGENAFINIKYWP 
TYESCLLN 


6331 


3 


495 


QQGQRVRTRGRRACASATPLEGCVDLSYPRTHAAIjLKVAQMVTL 
LIAFICVRSSLWTNYSAYSYFEWTICDLIMILAFYLVHLFRFY 
RVLTCISWPLSELLHYLIGTLLLLIASIVAASKSYNQSGLVAGA 
I FG FMATFLCMASIWLS Y FCI S CVTQST DAAV 


6332 


1 


878


.VTESNKFDJLVSFIPLLRERI YSNNQYARQFI IS WILVLES VPDI 
NLLDYLPEILDGLPQILGDNGKEIRKMCEWLGEFLKEIKKNPS 
S VKFAEMANI LVIHCQTTDDLIQLTAMCWMREF I QLAGRVMLP Y 
S SGILTAVLP CLAYDDR KKS I KEVANVCNQSLM KLVTPEDDELD 
BLRPGQRQAEPTPDDALPKQEGTASGEWTPSLHLTSCRGPREPD 
VIGVALGPHLSNQDYFMYVTHTIVAATQRSGSSGSPPFCRQDTG 
KLSTMATHSQLVKTGTGLEPRQAVSSSH 


6333 


3 


1467


TRTPSEAEAGGESPQSCVSAAHSDMTAGKPVSLLAPLIPPRSAG " 
QPLTFS PSGRQ PLRSLLVGMCSGSGRRRSSLS PTMRPGTGAERG 
GLMMGHPGMHYAPMGMHPMGQRANMPPVPHGMMPQMMPPMGGPP 
MGQMPG^5MSSVMPGMMMSHMSQAS^iQPAIiPPGVNSMDVAAGTAS 
GAKS MWTEKKS PDGRTYY YNTB TKQSTW B KPDDLKT PAEQLLS K 
u f w its. r KS DSGKP YY YNSQTKESRWAKPKELEDtiEGYQNT XVAG 
SLITKSNLHAMIKABESSKQBECTTTSTAPVPTTEIPTTMSTMA 
j AAEAAAAWAAAAAAAAAAAAANAHASTSASNTVSGTVPWPBP 
EVTS I VATWDNENTVTI STBEQAQLTSTPAI QDQS VEVS SNTG 
BETS KQETVADFTPKKEEEESQPAKKTYTWNTKEEAKQAF KELL 
KBKRVPSNAS WEQAMKMI INDPRYS AIiAKLS EKKQAFNAY KVQT 
EKK 


6334 


17


644 


GGNPSGRAAGFAAAAMPSS PLRVAWCSSNQNRSMEAHN 1 L&KR 
GFSVRSPGTGTHVKLPGPAPDKPNVYDFTCmDQMYlTOLLHKDK 
ELYTQNGIIiHMLDRNKRIKPRPBRFQNCKDLPT)LILTCEESVYD 
QWED LNS REQETCQP VHWNVDI QDNHEBATLGAFL I CE LQQC 
IQHTEDMENE IDELLQEFEEKSGRTFLHTVCP Y 


6335 


82 


529 


AARARPGVLCCRIiLGAALGDQSRVEMSYIPGQPVTAVVQRVEIH 
KLRQGENLILGPSIGGGIDQDPSQNPFSBDKTDKGIYVTRVSEG 
GPAEIAGLQIGDKI^VNGWDMTMVTHDQARKRLTKRS EEWRL 
LVTRQSLQKAVQQSWLS


6336 


1003


438 


HEPASKGRAEVGNMRLSVAAAISHGRVPRkMGLGPESRIHLLRN 
LLTGLVRHERI EAP WARVDE MRG YAE KL ID YG KLGDTNERAM RM 
ADFWLTEKDLI PKLPQfVtiAPRYXDQTGGYTRMLQ IPNRSLDRAK 
MAVIEYKGNCLPPLPLPRRDSHLTLLNQLLQGLRQDLRQSQEAS 
NHSSHTAQT PG I 


6337 


76 


524 


EG I QMLS VQPDTKPKGCAGCNRKI KDRYLLKALDKYWHEDCLKC ' 
ACCDCRliGEVGSTLYTKANLILCRRDYLRLFGVTGNCAACSKLI 
PAFEMVMRAKDNVYHLDCFACQLCNQRFCVGDKFFLKNNMI LCQ 
TDYBEGLMKEGYAPQVR 


6338


66 


1349 


APySESGTQGPLPTPANLFWTRPJWPDPTTSMSATDRMGPKAVp-- 
GliRLAI^LLLGLGTPKSGVQGQEGLDFPEYDGVDRVINVNAKNY 
Kl^FKKYEVLALLYHEPPEDDKASQRQFEMEELILEIiAAQVIiED 
KGVGFGLVDSEKDAAVAKKLGLTEVDSMYWKGDEVTEYDGEFS 
ADTIVEFLI^DVLEDPVEiaEGERELQAFENIEDEIKLIGYFKSK " 
DSEH YKAFEDAAEEFHPY I P FFATFDS KGAKKLTLKLNE I D FYE 
AFMEEPVTI PDKPNSEEE JVNFVEEHRRSTLRKLKPESMYETWE 
DDMDG IH I VAFAEE ADPDGFE FLETLKAVAQDNTENP DLS X X W I 

DPDDFPLLVPYWEKTPX)IDLSAPQIGVVNVTDADRLWMEMDDEE 
DLPSAE ELEDWIiEDVLEGE INTEDDDDDDDD 


6339 


246 


1813 


NRCDRGGGGQABRQAGQGCRTQGAGPGFGFGHS FFSQGAMKAFfT" 

TFCNATLLVFXSSVSEAKFDDFEDEEDI VE YDDNDFAEFEDVMEDS 

VTESPQRVIITEDDEDETTVELEGQDENQEGDFEDADTQEGDTE 

SEPYDDEEFEGYEDKPDTSSSKNKDPITIVDVPAHLQNSWESYY 

LEII^VTGLIAYIMNYIIGKNKNSRLAQAWFNTHREIJJESNFTL 

VGDDGTNKEATSTGKIJIQENEHIY11LWCSGRVCGEGMLIQLRFL 

KRQDLLNVLARMMR PVSDQVQ I KVTMNDEDMDTYV FAVGTRKAL 

VRLQKEMQDLSEFCSDKPKSGAKYGLPDSLAII^EMGEVTDGMM 

DTKMVHFLTHYADK1ESVHFSDQFSGPKIMQEEGQPLKLPDTKR 

TLLLTFNVPGSGNTYP KDMBALLPU4NMVX YS I DKAKKFRLNRE 

GKQKADKNRARVEENFLKLTHVQRQEAAQSRREEKKRAEKBRIM 

NEEDPEKQRRLEEAALRREQKKLEKKQMKMKQIKVKAM 


6340


2 


583 


EACAHTLSCPAFARIjGRARRRPWMSHRTSSTFRAERSFHSSSSS 
SSSSTSSSASRALPAQDPPMEKALSMFSDDFGSFMRPHSBPLAF 
PARPGGAGNIKTLGDAYEFAVBVRDFSPEDIIVTTSNNHIEVRA 
EKLAAIX5TVMNNFAHKCQLPEDVDPTSVTSALREDGSLTIRARR 
HPHTEHVQQTFRTEIKI 


6341 


2 


645 


KMAVLSAPGLRGFR I LGLRSSVG PAVQARGVHQSVATDGPS STQ 
PALPKARAVAPKPS S RGB YWAKLDDL VNWARRSS LW PMTFQLA 
CCAVEMMHMAAPRYDMDRFGVVFRASPRQSDVMIVAGTLTNKMA 
PALRKVYDQMPEPRYVVSMGSCANGGGYYHYSYSVVRGCORIVP 
VDIYIPGCPPTAEALLYGILQLQRXIKRBRRLQIWYRR 


6342 


2 


1191 


DPRVRAMLATIjARVAAIJUCTCLPSGRGGGRGLWTGRPQSDMNNI 
KPLEGVKI LOLTRVLAG P FATMNLGDLGAEVI KVER PG AG DDTR 
TWGP PFVGTESTYYLS VNRNKKS IAVNIKDPKGVKI 1 KBLAAVC 
D VFVENYVPGKLSAMGIX3 YED IDE IAPH 1 1 YCS ITG YGQTGP I S 
ORAGYDAVASAVSGLMHITGPBVACLSHIAANYLIGQKEAKRWG 
i Attvjt) A vt" x uafktkia>yiwglagnnqqfatvckild 

S KYKTNHLRVHNRKEL I KI LS ERFEE ESIrTS KWLYLFEGSGVP YG 
P INNMKNVFAEPQVL^GLVMEMEHPTVGKIS VPGPAVRYSKFK 
MS15ARPPPLLGQHTTHILKEVLRYDDRAIGELLSAGVVDQHETH 


6343 


2 


936 


GTAMVSDEDkLNLLVIWDANPIWWGKQALKESQPTLSKCIDAV 
MVLGNSHLFMNRSNKLAVIASHIQESRFLYPGKNGRLGDFF3LP 
GNPPEFNPSGSKDGKYELLTSANEVIVEEIKDLMTKSDIKGQHT 
ETLUAGSLAKALCYIHRMNKEVKDNQEMKSRILVIKAAEDSALQ 
YMNFMWIFAAQKQNILIDAC^^DSDSGLLQQACDITGGLYIjKV 
PQMPSLLQYUL.WVFLPDQDQRSOLILPPPVHVDYRAACFCHRNL 

IBIGYVCSVOiSIFCNFSPICTTCErAFKlSLPPVLlCAKXKKLK 
VSA 


6344


2508


147


TMPTATLGNIiRG YGMAS PGLAAPS LTP PQLAT PNLQQFFPQATR
QSLLGP PPVGVPMNPSQFNIiSGRNPQKQARTSS STTPNRKDSSS
QTMPVEDKSDPPEGSEEAAEPRMDTPEDQDLPPCPEDIAKEKRT
PAPEPE PCEAS EI* PAKRLRS SEE PTE KEPPGQLQVKAQPQARMT
VPKQTQTPDLL PEALEAQVLPRFQPRVLQVQAQVQS QTQPR I PS
TDTQ VQ PKLQ KQAQTQTS PEHLVLQQKQVQPQLQQ2AE PQKQVQ
PQVQPQAHSQGPRQVQLQQEAEPLKQVQPQVQPQAHSQPPRQVQ
LQLQKQVQTQT YPQVHTQAQPS VQ PQEHPPAQVS VQPPEQTHEQ
PHTQ PQVSLLAPEQT P VWHVCGLEM PP DAVEAGGGMEKTL PE P
VGTQVSMEEIQNE5AC^LDVGECENRAREMPGVWGAGGS LKVTI
LQSSDSRAFSTVPLTPVPRPSDSVSSTPAATSTPSKQALQFFCY
I CKAS CS SQQEFQDHMSE PQHQQRLGEI QHMSQACliliSLLP VPR
DVLETEDEEPPPRRWCNTCQLYYMGDLIQHRRTQDHKIAKQSLR
PFCTVC3TOYTKTPRKFVEHVTCSQGHKDKAKE1LKSLEKEIAGQDE
DHFITVDAVGCFEGDEEEEBDDEDEBEIEVEEBLCKQVRSRDIS
REEW KGS ETYS PNTAYG VDFLVP VMG Y I CRI CHKFYHSNSGAQL
SHCKSLGHFENLQKYKAAKNPSPTTRPVSRRCAINARNALTALF
TSSGRPPSQPNTQDKTPSKVTARPSQPPLPRRSTRLKT


6345


2


3483 


PRVRTKLI LLVMDKKRYERVG GGPKRLGRDVEMEEM I EQLQEKV 
HELEKQNDTLKNRLI SAKO^LQTQGYRQTPYNNVQSRINTGRRK 
ANENAGLQECPRKG I KFQDADVAETPHPMFTfCYGNSLLEEARGE 
IRNLBNVIQSQRGQIEELEHIiAEI LKTQLRRKENEIELSLLQIiR 
EQQATDQRSNIRDNVEMIKLHKQLVEKSNALSAMEGJCFIQLQEK 
QRTLKISHDAU^GDELNMQLKEQRl^CSLEKQLHSMKFSER 
RI EELQDRINDLBKERELLKENYDKLYDSAFSAAHEEQWKLKEQ 
QLKVQIAQLETALKSDLTDKTEIIiDRLKTERDQNEKIjVQENREL 
QLQYLEQKQQLjDBLiCKRIKLYNQENDINADEIiSEALLLIKAQKE 
QKNGDLSFLVKVDSEINKDLBRSMRELQATHAETVQELSKTRNM 
LIMQHKINKDYC^EVEAVTRKMENLQODYEIjKVEQYVHLLDIRA 
ARrHKLEAQLKDIAYGTKQYKFKPEIMPDDSVDEFDETlHLERG 
ENLraiHINKmSSEVLQASGDKEPVTFCTYAFYDFELQTTPV 
VRGLHPE YNFTSQYLVHVNDL FLQYIQKNTITLEVHQAYSTEYE 
TIAACQLKFHEILEKSGRIFCTASLIGTKGDIPNFGTVEYWFRL 
RVPMDQAIRLYRERABCALGYITSNFKGPEHMQSLSQQAPKTAQL 
SSTDSTDGNLNELHITIRCOraLQSRASKLQPHPYVVYKFFDFA 
DHDTAI I PS SNDPQFDDHMYFPVPMNMDLDRYLKSBSLS FYVFD 
DSDTQENIYIGKVNVPLISLAHDRCISGIFELTDHQKHPAGTIH 
VILKWKFAYLPPSGSITTEDIiGNFIRSBEPEWQRLPPASSVST 
LVLAPRPKPRQRLTPVDKKVS FVD IMPHQSDVSQEGSVDEVKEN 
TEKMQQGKDDVSLLSEGQLAEQSLASSEDBTEITEDLBPEVEED 
MSASDSDDCI I PGPISKNI KQPSEKIRIEIIALSLNDSQVTMDD 
TIQRLFVECRFYSLPAEETPVSLPKPKSGQWVYYNYSNVIYVDK 
ENNKAKRDILKAILQKQEMPNRSLRPTWSDPPEDEQDLECEDI 
GVAH VDLADMFQEGRDL I EQNID VFDARADGEG IGKLR VT VEAL 
HALQSVYKQYRDDLEA 


6346 


2321 


533 


QDRRLLRI£LQKTCQPTSTMSGSHTPACGPPSAi,TPSIWPQEIlT~ 
AKYTQKEESAEQPEFYYDEFG PRVYKBEGDEPGSSLLANS PLME 
DAPQRLRWQAHLEFTHNHDVGDLTWDKIAVSLPRSEKLRSLVLA 
GIPHGMRPQLWMRLSGALQKKRNSELSYRBIVXNSSNDET IAAK 
QIEKDLLRTMPSNACFASMGS IGVPRLRRVLRALAWLYPE IGYC 
QGTGMVAACLLLFLEEEDAFWMMSAI IEDLLPASYFSTTLLGVQ 
TDQR VLRHL IVQYL PRLDKLLQEHD I ELSLITLHWFLTAFASW 
DIKLLLRIWDLFFYEGSRVLFQLTLGMLHLKEEELIQSENSASI 
FNTLSDIPSQMEDAELI^VAMRLAGSLTDVAVETQRRKHLAYL 
IADQGQLLGAGTLTNI^QWRRRTQRRKSTITALLFGEDDLEAL 
KAKNI KQTEL VAD LR EAI L R VARH FQ CTD PKNGS WS RQLPG LL 
PNTALTPPTPLVGLYSLWQELTPDYSMESHQRDHENYVACSRSH 
RRRAKALLDFERHDDDELGFRXNDI ITIVSQKDEHCWVGEIiNGL 
RGWFPAKFVEVIiDERSKEYS IAGDDSVTEGVTDLVRGTLCPALK 
ALFEHGLKKPSLLGGACHPWLFIEEAAGREVERDFASVYSRliVL 
CKTFRLDEDGKVLTPBELLYRAVQSVNVTHDAVHAQMDVKLRSL 
I CVGLNEQVLHLWLEVLCSSL PT VEKWYQPWS FLRS PGWVQI KC 

BLRVLCCFAFSI£QDWEI*PAKREAQQPLKEGVRDMLVXHHLFSW 
DVDG 


6347


2921 



533 


QDRRIJiRLELQKTCQPTSTMSGSHTPACGPFSALTPSIWPQEIL 
AKYTQKEESAEQPEFYYDEFG FRVYKEEGDEPGSSLLANSPLME 
DAPQRXjRWQAHLEFTHNHDVGDLTWDKIAVSLPRSEKLRSLVIA 
GIPHGMRPQLWMRLSGALQKKRNSELSYREIVKNSSNDETIAAK 
Q I EKDLLRTMPSNACFASMQS I G VPRLRRVLRALAWL YPEI G YC 
QGTGMVAACLLLFLEEEDAFWMMSAI IEDLLPASYFSTTLLGVQ 
TDQRVLRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAFASVV 
DI KLLLR I WDL FFYEGSRVLFQLTLGMLHLKEEEL I QSENSAS I 
FNTLSDI PSQMEDAELLLGVAMRLAGSLTDVAVETQRRKHLAYL 
IADQGQLLGAGTLTNLSQVVRRRTQRRKSTITALLFGEDDLEAL 
KAKNI KQTEL VADLR EAI LRVARHFQ CTD PKNCSVVSRQLPGLL 
PNTALTPPTPLVGLYSLWQELTPDYSMESHQRDHENYVACSRSH 
RRRAKALLDFERHDDDELGFRKNDIITIVSQKDEHCWVGELNGL 
RGWFPAKFVEVLDERSKE YS IAGDDSVTEGVTDLVRGTLCPALK 
ALFEHGLKKPSLLGGACHPWLFIEEAAGREVERDFASVYSRLVL 
CKTFRLDEDGKVLT PEELLYRAVQSVNVTHDAVHAQMDVKIjRS L 
I CVGLNEQVLHLWLEVLCSSLPTVEKWYQ PWSFLRS PGWVQIKC 

ELRVLCCFAFSLSQDWELPAKREAQQPLKEGVRDMLVKHHLFSW 
DVDG 


6348 


3 


3679 


AGAEKCFVTLLACFLAKQQNKYKYEECKDLIKSMjl*RNELQFKBE " 
KIiAEQLKQAEELRQYKVLVHSQERELTQLREKIiREGRDASRSLN 
EHLQALLTPDE P DKS QGQDLQEQLAEGCRLAQHL VQKLS PENDN 
DDDEDVQVEVAEKVQKSSSPRBMQKAEEKEVPEDSLEECAITCS 
NSHGPCDSNQPHKNIKITFEEDEVNSTLWDRESSHDECQDALN 
ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA 
EKKQQFRNLKEKCFLTQLACFLANQQNKYKYEECKDLIKFMLRN 
E RQ FKE EKLAEQLKQAEELRQY KVLVHSQERELTQLREKLREGR 
DASRS IiNEHLQAIiLTPDEPDKSQG QDLQEQIiAEGCRLAQHL VQK 
LS PENDNDDDED VQVE VAEKVQKS S APREM P KAEEKE VP EDS LB 








ECAlTCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSkVEvH 
EDAVHIIPENESDDEEEEEKGPVSPRNLQESEBEEVPQESWDEG 
YSTLSIPPEMLASYKSYSSTFHSLEEQQVCMAVDIGRHRWDQVK 
KEDHEATGPRLSRELLDEKGPEVLQDSLDRCYSTPSGCLELTDS 
CQPYRSAFYVLEQQRVGLAVNMDEIEKYQEVEEDQDPS CPRLSR 

ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGOPYSSAVYSLEB 
QYLGLALDVDRI KKDQE EE EDQG P PCPRLS RELLEWE P E VLQD 

SLDRCySTPSSCIiEQPDSCQPYGSSFYALBEKHVGFSLDVGElB 
KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG 
PE VLQDSLDRC YS TPSG CLELTDSCQ P YRSAF YILEQQRVGLAV 
DMDE IEKYQEVEEDQDPSCPRLSGELLDEKEPE VLQESLDRCYS 
TPSGCLBLTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEE 
DQDPS CP RLSRELLDEXEPE VLQ DS LGRCYS TPSGYLELPDLGQ 

PYSSAVYSLEEQYLOLALDVDRIKKDQEEEEDQGPPCPRiSREL ' 
LEWEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKH 
VGFSLDVGEIEKKGKGKKRRGRRSKKERRRGRKEGBEDQNPPCP 
RI^SMLMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVPY 
S FEEEHIS FALYVDNR FFTLTVTSLHLVFQMGVI F PQ " j 


6349 


3 



3679


AGAKKCFVTLIACFLAKQQNKYKYEECKDLIK3MLRNE1»QFKEE — I 
KLAEQLKQ AEELRQ YKVL VHS QER E LTQLREKLREGRDAS RS LN 

EIII^AIiIjTPDEPDKSQGQDLQEQLAEGCRLAQHLVOFCLSPENDN 
DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLliECAITCS 
NSHGPCDSNQPHKNIK1TFEEDEVNSTLWDRESSHDECQDALN 
ILPVPGPTSSATNVSMWS AGPLS GEKAAIN ILE INEKLRPQLA 
EKKQQFRNLKEKCFLTQLACFLANQQNKYKYEECKDLIKFMLRN 
ERQFKEEKLAEQLKQAEBLRQYKVLVHSQBRELTQLREKLREGR 
DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRXiAQHIiVQK 
LSPENDNDnDEDVQVEYAEKVQKSSAPREMPKAEEKEVPEDSIaE 
ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW 
EDAVHI IPENESDDEEEEBKGPVSPRNLQESEEEEVPQES WDEG 
YSTLSIPPEP1LASYKSYSSTFHSLEEQQVCMAVDIGRHRWDQVK 
. KEDHEATGPRLSRELLDEKGPEVIiQDS LDRCYSTPSGCLELTDS 
(X)PYR5AFYVLEQQRVGIiAVNMDE I EKYQEVEEDQDPS CPRLSR 
ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE 
QYLGLALDVDRI KKDQEEEEDQGPPCPRLSRELIdffVVEPEVLQD 
SLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKHVGFSLDVGEIE 

kkgkgkkrrgrrskkerrrgrkegeedqnppcprlsrelldbkg 

PEVLQDS LDRCYSTPSGCLE LTDS CQP YRSAFY ILEQQRVGLAV 
DMDEIEKYQEVEBDQDPSCPRLSGELLDEKEPEVLQESLDRCYS 
TPSGCLELTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEE 
DQD PSCPRLSRELLDEKEPBVLQDSLGRCYSTPSGYLELPDLGQ 
P YS S AVYSLEEQ YLGLALDVDR t KKDQEEEEDQGPPCPR LSREL 
LEWEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKH 
VGFSLDVGE I EKKGKG XKRRGRRS KKERRRGRKEGB EDQNP P CP 

RLNSMLMEVEEPEVIiQDSLDICYSTPSMYFELPDSFQHYRSVFY | 

SFERRHTSPJVT.WTWQWTT'PT TlffOT trr irc>nvrr<T r-r rtrm 

oroawiior/uji vjjiiKr r xijXv I aLtiiiVryMG V Z r PQ { 


6350


3 


3679


AGABKCFVTLIiACFLAKQQNKYKYEECKDLI KSMLRNELQFKEEI 
KIAEQLKQAEELRQYKVLVHSQERELTQLRBKLREGRDASRSLN 
EHLQALLTP DEPDKS QGQDLQEQLAEGCRLAQHLVQKLS P ENDK 
DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLEECAITCS 
NS HGPCDSNQPHKNI K I TFE EDE VNS TLWDRESSHDECQDALN 
I L P VPGPTS S ATNVSM WSAGP LSGEKAAINI LB INEKLRPQLA 
EKKQQFRNLKEKCFLTQLACFLANQQNKYKYEECKDLIKFMLRN 
ERQ FKEE KLAEQLKQAEELRQ YKVLVHSQERELTQLRB KLREGR 
DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK 
LSPENDNDDDEDVQVEVAEKVQKSSAPREMPKAEEKEVPEDSLE 



~63ST 



6352 



"Sls- 



1SCAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW 
EDAVHI I PENESDDEEEEEKGP VSPRNLQES EEEEVPQES WDEG 
YSTLS IPPEMLAS YKS YSSTFHSLEEQQVCMAVDIGRHRWDQVK 
KEDHEATGPRIiSRELLDEKGPEVLQDSLDRCYSTPSGCLEIiTDS 
CQPYRSAPYVLEQQRVGLAVKMDEIEKYQEVEEDQDPSCPRLSR 
ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE 
QYLGLALDVDRIKKDQEEEEDQGPPCPRI*SRELI*EVVEPEVLQD 
SLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKHVGFSLDVGEIE 
KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG 
PEVLQDSLDRCYSTPSGCLELTDSCQPYRSAFYILEQQRVGLAV 
DMDEIEKYQEVEEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS 
TPSGCLELTDS CQPYRSAFYI LEQQRVGLAVDMDEI EKYQEVEE 
DQDPSCPRLSRELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQ 
PYSSAVYSLBEQYIXaLALDVDRIKKDQBEEEDQGPPCPRIiSREL 
LE WEPEVLQDSLDRC YS TPS S CLEQPDS CQPYGS SFYAI* EEXH 
VGFSLDVGEIEKKGKGKKRRGRRSKKERRRGRKEGBEDQNPPCP 
RLNSMLMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY 
S FEE EH 1 3 FALYVDNRF FTLTVTSLHLVFQMG VI F PQ 



"319 I »UARiuCTERflQi ^ 

1 RTVGALPRGPRQNSRLGLPIiLLMPEEARLLAE IGAVTLVS APRP 
DSRHHSLALTSFKRQQEESFQEQSALAAEARETRRQELLEKITE 
GQAAKKQKLEQASGASSSQEAGSSQAAKBDETSDGQASGEQEEA 
| GPSS SQAGPSNGVAPL PRSAIiliVQLATARPR P VKARPLDWRVQS 
KDWPHAGRPAHELRYSIYRDLWERGFFLSAAGKFGGDFLVYPGD 

PLRFHAHY1AQOTAPEDTIPLQDLVAAGRLGTSVRKTLLLCSPQ 
PDGKWYTSLQWASLQ 



"92T 



TIT 



WS B WLS PCHAAKCKGLSMbRI TMKTRAI 5LAADATE FVQGRSAP 
AMARSLVHDTVFYCLSVYQVKISPTPQLGAASSAEGHVGQGAPG 
LMG1JMOTEGGVNHENGMNRIX5GMIPEGGGGNQEPRQQPQPPPEE 
PAQAAMEGPQPENMQPRTRRTKFTLLQVEELES VFRHTQ Y PDVP 

TRRELAEOT^GVTEDK^VWFKNKRARCRRHQRELMLANELRADP 
DDCVYIWD 



R^AGAGA^PEA RARPP&VQAAEEEKEMDI^DSASR\^CQRIL5M 
VNTDDWAIIIJiQKNMLDRFEKTNEMLLNFNNLSSARLQQMSER 
FiaHTRTLVEMKRDLDSIFRRIRTLKGKLARQHPEAFSHIPBAS 
FLEBEDEDPIPPSTTTTIATSEQSTGSCDTSPDTVSPSLSPGFE 
DLSHVQPGSPAINGRSOTDDEEMTGE 



510 



-635T" 



158 



PSLRPMEPTRDCPLyGQAFSAJt LPWGAIDVSDLfePVPD^QEVFC 
HPVTDQSLIVELLELQAHVRGEAAARYHFEDVGGVQGARAVHVE 

SVQPLSLENIJaRGRCQBAWVl^GKQQlAKENQQVAKDVTLHQA 
LLRLPQYQTDLLLTFNQPP 



"6355 



354 



TJT" 



RGSSAAFRGSGLRGAMIRRVLPHGMGRGLLTRRPGTRRGGFSLD 
WDGKVSEIKKKIKSILPGRSCDLIjQDTSHLPPEHSDWIVGGGV 
LGLS VAYWIiKKLES RRGAIRVLWERDHTYS QASTGLS VGG I CQ 
QFSLPEN I QLSLFS AS PLRN INE YtAWDAP PLDLRFNPSG YLL 
LAS EKDAAAMESNVKVQRQEGAXVSLMS PDQLRNKFPWINTEGV 
AIJISYGMEDEGVJFI>PWCXIjQGIJUIKVQSIjGVLFCO^EVTRFVSS 
SQRMLTTDDKAVVLKRIHEVHVKMDRSLEYQPVECAIVINAAGA 
WSAQIAALAGVGEGPPGTLQGTKtiPVEPRKRYVYVWHCPOGPGL 
ETPLVADTSGAYFRREGLGSNYLGGRSPTEQEEPDPANLEVDHD 
FFQDKVW PHLALRVPAFETLKVQSAWAG YYDYNT FDQNGWG PH 

PLWNMYFATGFSGHGLQQAPGIGRAVAEMVLKGRFQTIDLSPF 
LFTRFYLGEKIQENNII 

I TGLTSS CL PLQ VMMTKRTKDMGKFSS VT VST I DEEEEE I BARE V" " 
I ADS YAQNAKVI EKQLERKGMS KRRLQELAELEAKKAKMKGTLID 
! NQFK 


6357


2 


915 


GLLRNMALLVRVLRNQTS I SQWVP VCSRLI PVSPTQGQGDRALS 
RTSQWPQMSQSQACGGSEQIPGIDIQLNRKYHTTRKLSTTKDS P 
QP VEEKVGAFTJCI IBAMGFTGPLKYS KWKIKI AALRMYTS CVEK 
TDFEE FFLRCQMPDTFN S WFL I TLLHVWMCLVRMKQ EGRSGKYM 
CRI IVHFMWEDVQQRGRVMGVNPYI LKKNMILMTNHFYAAILGY 
DEG ILSDDHGLAAALWRTFFNRKCEDPRHLELLVEYVRKQ IQYZ* 
DSMNGEDLLLTGEVSWRPLVEKNPQSILKPHSPTYNDEGL 


6358 


2009 



1040 


ASDALHSLSAPYLRLSSRSAARPATMTEQAISFAKDFLAGGIAA 
AI S KTAVAP I ER VKLLLQ VQHAS KQ I AAD KQYKG IVDC I VRI P K 
EQGVLS FWRGNLANVI RYFPTQALNFAFKDKYKQI FLGGVDKHT 
QFWRYFAGNLASGGAAGATSLCFVYPLDFARTRLAADVGKSGTE 
RB FRGLGDCLVK I TKSDG I RGL YQG FSVSVQG 1 1 IYRAAYFQVY 
DTAKG MLP D PKN TH I WS WMIAQT VTA VAG WS Y PF DTVRRRMM 
MQS GRKGAD IMYTGTVD CWRKI FRDEGGKAFFKGAW SNVLRGMG 
GAFVLVLYDELKKVT 


6359 


98 


1086 


VCRQEEEKMKEDCLPSSHVPISDSKSIQKSEIiLGLLKTYWCYHE 
GKS FQLRHREBEGTL 1 1 EGLLN3 AWGLRRP IRLQMQDDREQ VHL 
PSTS WMPRRPSCPLKEPS PQNGNITAQGPS IQPVHKAESSTDSS 
GPLEEAEEAPQIiMRT KS DAS CMS QR RP KCRA PGE AQR I RRHRFS 
INGHFYNHKTS VFTPAYG S VTNVRVMSTMTTLQVLTLLLN KFRV 
EDGPSEFALYIVHESGERTKLKDCEYPLISRILHGPCBKIARIF 
IjMEADLGVEVPHEVAQYIKFEMPVLDSFVEKLKEEEEREI XKLT 
MKFQALRLTMLQRUBQLVEAK 


6360 


1 


345 


GTRGAVPSTLEEVVIjPPRSCRVFW IHSGTTMSKVS FKITLTSDP " 
RLPYKVLSVPESTPFTAVIiKFAAEEFKVPAATSAIITNDGIGIN 
PAQTAGNVFLKHGSELRI IPRDRVGSC 


6361 


615 


158 


RPGLGQLQHCAIAPQAGNRRCRFHGRLHALTRSTHRGKPMSIMQ " " 
FKDTLNTPLPDSS PVAVPLGAPIAVASTLSVEHNDGVETGI WAC 
APGRWRRQITSQEFCHFIQGRCTPTPDDGETLHIQAGDAIiMLPA 
NSTGIWDIQETVRKTYVLIL 


6362 


350 


1576 


riPUAiSHSAAUOiQQLPPTS SSSAVS EAS FSYKENLIGALLAI F 
GHLWS IALNLQKYCHI RliAGSKDPRAYFKTKTWWLGLFLMLLG 
E LGVF AS YAFAPLSLIVPLSAVS VIASAI IGI IFIXEKWKPKDF 
LRR YVLS FVGCGLAWGTYLL VTFAPNS HEKMTGENVTRHL VS W 
PFLL YMLVE I ILFCLLL YFYXEKNANN I WILLLVALLGSMTW 
TVKAVAGMLVLS I QGNLQLD YP I FYVMFVCMVATAVYQAAFLS Q 
ASQMYD9SLIASVGYIIjSTTIAITAGAIFYLDFIGEDVLH1CWF 
ALGCLIAFI/3VFL I TRNRKK? I PFEPYI SMDAMPGMQNMHDKGM 
TVQPBLKASFSYGALENNDNISEIYAPATLPVMQEEHGSRSASG 
VPYRVLEHTKKE 


6363 


21 1 


1201 


RRTRLGS S FPRRRDS SAMES YDV IANQ P WIDNGSGVIKAGFAG 
DQI PKYCFPNYVGRPKHVRVMAGALEGD I FIGP KAEEHRGLLS I 
RYPilBHGIVKIJWIJDMERIWQYVYSKDQLQTFSEEHPVIjIiTEAPIj 
N PRKNRERAAE VFFETFNVPAL FI S MQAVLSL YATGRTTGWZiD 
SGDGVTHAVPI YEGFAMPHS I MRIDIAGRDVSRFLRLYLRfCEGY 
DFHSSSEFEIVKAI KERAC YLS INPQKDETLETEKAQYYLPDGS 
TIEIGPSRFRAPELLFRPDLIGEESEGrHEVLVFAIQKSDMDLR 
RTLFSNIVLSGGSTLFKGFGDRLLSEVKKLAPKDVKIRISAPQE 
RL YSTWIGGS ILAS LDTF KKM WVS KKE YEBDGARS IHRKTF 


6364


21 


1201 


RRTRLGS S F PRRRDS 3 AM ES YDVI ANQP WI DNGSGVI KAGFAG 
DQIPKYCFPNYVGRPKHVRVMAGALEGDIFIGPKAEEHRGLLSI 
RYPMEHGIVKDV^MERIWQYVYSKDQLQTFSEEHPVLLTEAPL 
NPRKNRERAAEVFFETFNVPALPISMQAVLS LYATGRTTGWLD 
SGDGVTHAVPI YEG FAMPHS IMRIDIAGRDVSRFLRLYLRKEG Y 
DFHSSS E FE IVKAI KERAC YLS INPQKDETLETEKAQYYLPDGS 
TIEIGPSRFRAPELLFRPDLIGEESEGIHEVLVFAIQKSDMDLR 








RTLFSNIVLSGQSTLPKGPGDRLLSBVKKLAPKDVKIRISAPQE 
RL YS TW IGGS ILASLDT FKKMWVS KKEYEBDGARS IHRKT F 


6365 


234 


1989 


KHKSRASCAARAQAFGPSREREVHSRFRSGLRRLGESNSGCCTM 
AS MGTLAFDE YGRP PL 1 1 KDQDRKSRLMGLEALKSHIKAAKAVA 
NTMRTSLGPNGLDKKMVDKDGDVTVTNDGAT2LSMMDVDHQ I AK 
LMVELS KSQDDE I GDGTTGWVLAGAIiLEEAEQLLDRG I HP IRI 
ADGYEQAARVAI EHLDKI SDSVLVDIKDTEPLIQTAKTTLGSKV 
VKS CHRQMAE I AVNAVLT VADMB RRD VD FEL I KVE GKVGG R L ED 
TKLIKGVTVDKDFSHPQMPKKVEDAKIAILTCPFEPPKPKTKHK 
LDVTSVEDYKALQKYRKEKFE^IQQIKETGANI*AICQWGFDDE 
ANHLLLQNNLPAVRWVGGPEIELIAIATGGRIVPRFSELTAEKL 
GFAGLVQEISFGTTKDKMLVIEQCKNSRAVTIFIRGGNKMIIEE 
AKRSLHDALCVTRNLIRDNRWYGGGAAEIS CALAVSQEADKCP 
TLEQYAMRAFADALEVI PMAUSENSGMNPIQTMTEVRARQVKEM 
NPALGIDCLHKGTNDMKQQHVIETLIGKKQQISLATQMVRMIIiK 
IDD I RKPGE SEE 


6366 


257


1898


GNKEGAHSSTFW VLLS I KLGAVAMLCKEQGI TVLG LNAVFD it»V 
IGKFNVLEIVQIC^I^KDKSLEmGMIJWGGIJjFRMTLLTSGGAG 
MLYVRWRIMGTGPPAFTEVDNPAS FADSMLVRAVNYN YY YSLNA 
WLLLCPWWLCFDWSMGCIPLIKSISDWRVIALAALWFCLIGLIC 
QALCSEDGHKRR I LTLGLGFLVIP FLPASNLFFRVGFWAERVL 
YJjPS VGYCVLLTFGFQALS KHTKKKKLI AAVVLG ILF I NTLRCV 
LRSGEWRSEEQLFRSALSVCPLNAKVHYNIGKNLADKGNQTAAI 
RYYREAVRIxNPKYVHAMNNLGNXLKERNELQEAEELLSLAVQIO 
PDFAAAWMNLG I VQNSLKRFEAAEQS YRTAI KHRRKYP DC YYNL 
GRL YADLNRH VDALNAWRNATVLKPEHSLAWNNM I 1 LLDNTGNL 
AQAEAVGREALELIPNDHSLI4FSIANVLGKSQKYKESEALFLKA 
I KANPNAAS YHGNLAVL YHRWGHLDLAKKHYEI SLQLDPTASGT 
KE NYGLLRRKLELMQ KKAV 


6367 


287 


1934 


SIGFPVMLVLSILLYTCEMFQDSVAFEDVAVSFTQEEWAXjLDPS 
QKNLYRD VMQETFKNLTS VGKTWKVQWI EDEYKNPRRNLSLMItE 
KLCES KESHHCG ES FNQ IADDMLNRKTL PGITPCESS VCGEVGT 
GHSS LNTHI RADTGH KSS B YQE YGENP YRNKECKKAFS YLD S FQ 
SHDKACTKEKP YDGKECTETF I SHSCI QRHRVMHSGDGP YKCKF 
CGKAFYFLNLCLIHERIHTGVKPYKCKQCGXAFTRSTTLPVHER 
TOTGWADECKECGNAFSFPSEIRRHKRSHTGEXPYECKQCGKV 
FISFS 3 1 QYHKMTHTGEKPYE CKQOGKAFROGSHLQKHGRTHTG 
EKPYECRQCGKAFRCTSDLQRHBKTKTEDKPYGCKQCGKGFRCA 
SQLQIHERTHSGEKPHECKECGKVFKYFSSLRIHBRTHTGEKPH 
S CKQCGKAFRYFS S LHI HERTHTGDKP YE CKVOG KAFTCSS S I R 
YHERTHTG EfG?YECKHCGKAF ISKYIRYHERTHTGEKP YQCKQC 
GKAFIRASSCREHERTHTINR 


6368


1 


327 


RP VPAKLN PRS WPRTAGALPLRP PPLTMAVFHDEVEI EDFQYDE 
DS3TYFYPCPCGDNFSITKBDLENGEDVATCPSCSLI IKVTYDK 
DQ FVCGET VPAPS ANKELVKC 


6369 


1 


1745 


AGCCRDTRFPTPRGFGSU^FCRSAACTVl'RTIHGSPREDTGT 
PRSREMMFQDSVAFEDVAVSFTQEEWALLDPSQKNLYRDVMQET 
FKMLTSVGKTWKVQNIEDEYKKPRRNLSLMREKLCESKESHHCG 
ESFNQIADDMUJRKTLPGITP CESSVCGEVGTGHSSLNTHI RAD 
TGHKSSEYQEYGBNPYRNKECKKAFSYU>SFQSHDKAC7KEKPY 
DGKECTETFISHS CI QRHRVMHSGDGP YKCKFCGKAFYFliMLCL 
I HERIHTGVKP YKCKQCGKAFTRS TTLPVHERITTTGVNADE CKE 
CGNAFS FPS E IRRHKRSHTGEKPYE CKQCGKVFISFSS IQ YHKM 
THTGEKPYECKQCGIO\FRCGSHLQKHGRTHTGEKPYECRQCGKA 
FRCTS DLQRHEKTHTEDKP YG CKQ CGKG FRCAS QLQ I HERTHSG 
EXPHECKECGKVFKYFSSLRIHERTHTGEKPHECKQCGKAFRYF 








S S LH IHERTHTGD K P YECKVCGKAFTCS S S I RYHERTHTG EKP Y '" 
ECKHOGKA P I SNYIRYHERTHTGEKP YQCKQCGKAFI RAS S CRE 
HERTHTINR 


6370 


1711 


329 


F VLS EQRLKTE rtw prs pglgrgaaaagartagagllrll lgcg 

ALVGGLRPVTMTTPANAQNASKTWELSLYEliHRTPQEAIMDGTE 
I AVS PRS LHS E LM CP 1 CLDMLKnTMTTKB CLHRFCS DC I VTALR 
5GNKECPTCRKKLVSKRSLRPDPNFDALISKIYPSREEYEAHQD 
RVLZRLSRLHNQQALSSSIEEGLRMQAMHRAQRVRRPIPGSDQT 
TTMSGGBGEPGEGEGDGEDVSSDSAPDSAPGPAPKRPRGGGAGG 
SSVGTGGGGTGGVGGGAGSEDSGDRGGTLGGGTLGPP3PPGAPS 
PPEPGGEIELVFRPHPLLVEKGEYCQTRYVKTTGNATVDHLSKY 
! LALR I ALE RRQQQEAGE PGGPG GG AS DTGGPDG CGGEGGGAGGG 
DGPEEPALPSLEGVS EKQYTIY IAPGGGAFTTIiNGSLTIjE LVNE 
KFWKVSR PLELC YAPT KD P K 


6371 


3 


288 


GVANMSTAMNFGTKSFQPRPPDKGSFPLDKLGECKSPkEKFMICC " 

LHNNNFENALCRKESKEYLECIRMERKIjMLQEPLEKLGFGDLTSG 

KSEAXK 


6372


2141 


625


RVSAIASEGKAEERYKKLEDIiIiEKS PS LiVKMD QT.nbvvMrT/vsfw' ~~ 
LPKVPEKKLKLVMADKELYRACAVEVRRQIWQDNQALFGDEVSP 
LLKQYILEKESAIjFSTEIjSVLHNFFSPSPKTRRQGEWQRIiTRM 
VGKNVKLYDMV1^FLRTLFLRTRNVHYCTLRAELL.MSLHDLDVG 
EICTVDPraKFTWCIiDACIRFJ?FVDSKRAREI^FLDGVKKGQE 
QVLGDLSM I LCD PFAINTLALSTVRHLQELVGQETLPRDS PDLL 
LLLRLLALGQGAWDKIDSQVFKEPKMEVELZ TRFLPMLMS flvd 
DYTFKVDQKLPAEEKAPVSYPNTLPES ftkflqeqrmacevgly 
YVIiEITKQRNKNAI^RLLPGLVETFGDLAFGDIFLHIXTGITLAL 
LADEFALED FCS SLFIXjFFIjTAS PRKENVHRHALRLLIHLHPR V 
APSKLEALQKALEPTGQSGEAVKELYSQLGEKLEQLDHRKPSPA 
QAAET PALELPLPS VPAPAPL 


6373 


67 


711 


PSRAARAS PARL2AMVS W I XSRLWLI FGTLYPAYYSYKAVKSK 
DIKEYVKJ^YWIIFALFTTAETFTDIFIOIFPFYYEL^AFVA t 
WLLSPYTKGSSLLYRKFVHPTLSSKEKEIDDCLVQAKDRSYDAL 
VHFGKRGLNVAATAAVfiAASKGQGALSERLRSFSMQDLTTIRGD 
GAPAP SGPP P PG SGRASGKHGQPKMSRSASESASS SGTA 


6374 


535 


2105 


HKLFCSYISTSEFPSSTRHHSCPTHTFCNYTSSTIFLSSTRDHS 
CPTHTFCNYTSSTIFLSSTRDHSCPTHTSCNYTSSTIFLSSTRD 
HSCPTHTSCNYTSSTIFLSSTRDHSCPTHTFCNYPRPIIRLSSC 
CPABLQTEGSNGKKEVLSGFQWLEDTVLFPEGGGQPDDRGTIN 
DISVLRVTRRGEQADHFTQTPLDPGSQVLVRVDWERRFDHMQQH 
S GQHLI TAVADHLFKLKTTS WELGRFRS AI ELDTPSMTAEQVAA 
IEQSVNBKIRDRIiPVNVRELSLDDPEVEQVSGRGLPDDHAGPIR 
VVNIBGVDSNMCCGTHVSNLSDLQVIKILGTEKGKKNRTNLIFL 
SGNRVLKWMERSHGTE KALTALLKCGAEDHVEAVK KI iQNSTKI I» 
QKNNLNLLRDLAVHI AHS LRNSPDWGGW I LHRKEGDS EFMNI I 
ANEIGSE3TLLFLWGDEKGGGLFLLAGPPASVETLGPRVAEVI, 
EGKGAGKKGRFQGKATKMSRRMEAQALLQDYISTQSAKS 


6375 


1 


1535 


AIMAAATRPVRLPEAGCEGRERCWNPSRSRSHSGEGGLAAWSRT 
CPGRPRRPGQQWRG PTMLVTAYLAFVGLIiASCLGLELS RCRAK 
PPGRACSNPSFLRFQI^FYQVYFLALAADWLQAPYLYKLYQHYY 
FLEGQIAI LYVCGLASTVLFGLVASSLVDWLGRXNS CVLFSLTY 
SLCCLTKLSQDY FVLLVGRALGGLSTALLPSAFEAWY XHEHVER 
HDFPAEWX PAT FARAAFWNHVLAVVAG VAAEAVAS W IGLGPVAP 
FVAAI PljLALAGALAIjRNWGENYDRQRAFSRrCAGGLRCLLS DR 
RVLLLGTIQALFESVI FIFVFLWTPVLDPHGAPIjGI I FS SFMAA 
SLLGSSLYR I ATS KR YHLQPMHLLSLAVLI WFSLFMLTFSTS p 
GQESPVESFIAFLLIELACGLYFPSMSFLRRKVIPETEQAGVLN 








wfrvplhsiacxgllvi^sdrktgtrnmfsicsavmvmalDvv 
vglftwrhdaelrvpspteepyapel 


6376 


380


1437 


ISSTDIDHYRFSFLVNSKMPSKESWSGRKTNRAAVHKSKQBGRQ 

ODI_ir*T2VAT«f3MVTy2COVOet7»PTCT/\?»T trr Ti»vnrir>nfiY - 

W^iJw A^vu^MiUAsb Pl^S VTIWQPLKLF^ 

NEQI PKYEKIHNFKVHTFRGPHMCEYCANPMWGLIAQG VKCADC 

GLNVHKQCSKMVPNDCK^DLKHVKKVYSCDLTTLVK^HTTKRPM 

WDMCI RE I ESRGLNSEGLYRVSGPSDLIEDVKMAFDRDGEKAD 

I S VNMYEDINI ITGAIjKLYFRDLPIPLITYDAYPKFIESAKIMD 

PDEQLET1*HEALKLLPPAHCETLRYLMAHLKRVTLHEKENLMNA 

ENLGIVFGPTLMRSPELDAMAALNDIRYQRLWELLIKNEDILF 


6377 


1 23 11 


184? 


SRI RRRS SRRPRBP PGPSRRRRRRRP0PRTMPSEKTFKQRRTFE 
QRVEDVRL I REQHPTKI P VI I ERYKQEKQLP VLDKTKFLVPDHV 
NMS E L I K 1 1 RRRLQLNANQAF FLLVNGHSMVS VSTP 1 S EVYESE 
KDEDG FL YM VYASQE TFGMKLS V 


6378 


606 


191 


GAGPWEAFPDGIGRRSRRARLPQYKRPPGRVGGGDSGRrtNMAOT" 
DLAL I PD VD I DS DG VFKY VL I R VHSAPRSGAPAAES KE I VRG YK 
WAE YHAD I YDKV SGDMQKQGCDCECLGGGR I SHQSQDKKI HVYG 
YSMAYG PAQHAISTEKI KAKYPD YBVTWANDGY 


" 6379 


35' 


J to 


BRAGS PSPSRAALRRCAPQRSQAPRWPDRAACRRSFQGSQGRAY 
LFNS WNVQCG PAEERVLLTGLBAVADI YCENCICTTLGW KYKHA 
FESSQKYKEGKYI IEIiAHMIKDNGWD 


" 6380 


1414 


462 


PAVQGQRGAGPP'lXJRGSGNMARFALTVVRHGETiiFWKEKXIQGQ"™ 
GVD E P LS ETG F KQ AAAAG I FLNNVK FTHAFS S DLMRTKQTMHGI 

LER5KFCKDMTVKYDSRLRERKYGWEGKALSELRAMAKAAREE 
CPVFTP PGGETLDQ VKMRG IDFFE FIjCQL ILKEADQKEQ FSQG S 
PSNCLETSLAE I FPLGKNHSS KVNSDSG I PGLAAS VLWS HGAY 
MRSLFDYFLTDLKCSLPATLSR5ELMSVTPNTGMSLFI ICTFEEG 
REVKPTVQCICMNLQDHLNGLTENSLGL«NLPSKSNHFEPLKGVP 
LALFTSLLC 


6381 
6382 [ 


1668 


218 



AWRAQGSRGFSGAGWRPRQAAAMNFSEVFKtSSLLCKFS PDGK 
YLAS CVQYRLVVRDVNTLQILQLYTCLDQIQHIEWSADSLFIIjC 
AMYKRGLVQVWS LEQ PE WHCKIDEGS AGLVAS CWS PDGRH I LNT 
TEFHLRITVWSLCTKSVSYIKYPKACLQGITFTRIX»RYMAIjAER 
RDCKD YVS I FVCS DWQLLRHFDTDTQDLTG IEV7APNGCVLAVWD 
TCLB YKI LL YSLDGRLLSTYS AYEWS LG I KS VAWS PSSQFIAVG 
S YDGKVRILNHVTW KMI TE FGHPAAINDPKI WYKEAEKS PQLG 
LGCLSFPPPRAGAGPLPSSESKYEIASVPVSLQTLKPVTDRANP 
jaGIGMLAFSPDSYFLATRNDNIPNAVWVWDIQKLRLFAVLEQL 
&f vica* W«OFQQPRIjAICTGGSRLYLWSPAGC1MSVQVPGEGDFA 
VLSLCWHLSGDS MALLS KDHFCLCFLETEAWGTACRQLGGHT 




2 


1062 


feededrnlcliayplkgdhgivdivdnsdcepkskllrwttnk 
khhvletektpkdwvrqhrkeekmkshkleeefewlkksevlyy 

TVEKKGNISSQLKHYNPWSMKCHQQQLQRMKENAKHRNQYKFIIj 

lenltsryevpcvldlkmgtrqhgddaseekaanqirkcqqsts 
avigotvcgmqvyqagsgqlmfmnkyhgrxlsvqgfkealfqff 
hngrylrreixgpvlkkltelkavlbrqesyrfysssllviydq 

KERPEWLDSDAEDLEDLSEESADESAGAYAYKPiaASSVDVRM 

idfahttcrlygedtvvhegodagyifglqslidivteiseesg 

E 


6383 


3159 


1061 


spapgrpsphgsqpaaraaaapampsakqrgskgghgaaspsek 
gahpsaarplaaptpaapacrspspggapasfpgraprslasqp 
aaraaaapampsakorgskgghgaas ps exgahpsggaddvtuck 
pppapqqpppppaphpqqhpqqhpqnqahgkgghrgggggggks 
sssssasaaaaaaaasssascsrrlgralnplfylalvaaaafs 
gwcvhhvlebvqqvrrshqdfsrqreelgqglqgveqkvqslqa 
tfgtfesilrssqhkqdltekavkqgesevsrisevlqklqnei 








LKDLSDGIHVVKDARERDFTSLENTVEERLTELTKS INDl^I AI F 
TEVQKRSQKEINDMKAKVASLEBSEGNKQDLKALKEAVKE1QTS 
AKSRE WDMEALRS TLQTMESD I YTBVREL VS LXQSQQAFKEAAD 
TERLALQALTEKLLRSEESVSRLPEEIRRLEEBIiRQLKSDSHGP 
KEDGGPRHSEAFEALQQKSQGLDSRLQHVEDGVLSMQVASARQT 
E SLESLLS KSQEKEQRLAALQGR LEGLGS S EADQDGIiASTVRSL 
G ETQLVLYGDVEBLKRS VG ELPSTVESLQKVQEQVHTLLSQDQA 
QAARIiPPQDFLDRLSSIiDNLKASVSQVEADl»KMLRTAVDSLVAY 
S VK1ETNENNLBS AKGLLDDLRNDLDRLFVKVEKIHE KV 


6384 


738


1904 


IWEVPVCtTHLLHLQQAKQPLPPPSSSIUEEDADBANRAIGEKR 
AAPDSGKKPKTPKTKQQKDPNEPQKPVSAYAIiFFRDTQAAIKGQ 
N PNATFGEVS Q I VAS M WDSLGEEQKQVYKRKTE AAKKE YliKALA 
AYRASLVSKAAAESAEAQTIRSVQQTIiASTNLTSSLLLNTPLSQ 
HOT VSAS PQTLGXJSLPRSIAPKPLTMRLPMNQI VTSVTIAANMP 
SNIGAPIilSSMGTTMVGSAPSTQVSPSVQTQQHQMQLQQQQQQQ 
QQQMQQMGXJQQLQQHQMHQQIQQQMQQQHFQHHMQQHLQQQQQH 
LQQQINQQQLQQQLQQRJ^QLQQLQHMQHQSQPS PRQHS PVASQI 
TSP I PAIGS PQPASQQHQSQIQSQTQTQVLSQVSIF 


6385 


2 


1584


PRVRAADVAAGAOAVVi^AGMAKSNOF^ qt cptdVc 

IAGOPDAATTDELS SLGSDSEANGFAERRIDKFGFI VGSQGAEG 
ALEE VPLBVLRQRESKWLDMLtmWDKWMAIC^ p 
PSLRGRAWQYLSGGKVKLQQNPGKFDELDMSPGDPKWLDVIERD 
LHRQFPFHEMFVSRGGHGQQDLFRVLKAYTLYRPEEGYCQAQAP 
IAAVljLrttWPAEQAFWCLVOICEKYLPGYYSEKT.EATnT.nnPTT 

FSLLQKVSPVAHKHLSRQKIDPLLYMTEWFMCAFSRTLPWSSVL 
RVWDMFFCEGVKI IFRVGLVLLIG1AJLGSPEKVKACC<3QYETIER 
LRSLSPKIMQBAFLVQEWELPVTERQIEREHLIQLRRWQETRG 
ELQCRSPPRLHGAKAILDAEPGPRPALQPSPSIRLPLDAPLPGS 
KAKP KP PKQAQKEQRKQMKGRGQIiEKP P APNQAMWAAAGDACP 
PQHVPP KDS AP KDS APQDLAPGVSAHHRSQESLTSQESEDTYL 


6386 


819 


195 


TV€GSFYW3IMQRASRLKREXiHMI*ATEPPPGI TCWQDKDQMDDI* 
RAQI LGGANTPYEKGVFKLEVI IPER YPFBPPQIRFLTPI YHPN 
IDSAGRICXDVLKLPPKGAWRPSLNLATVLTS IQLLMSEPNPDD 
PLMADI SS EFKYNKP AFLKNARQWTEKHARQ KQKADEEEMLDNIj 
PEAGDSRVHNSTQKRKASQLVGI EKKFH PDV 


6387


1 


662


PGPTHASADAWADAWAQPNMAMHNKAAPPQI PDTRREI*AE1»VKR 
KQELAETLANLERQIYAFEGSYLEE)TQMYGNI IRGWDRYLTKQK 
NSNSKNDRRNRKFKEAERLFSKSSVTSAAAVSALAGVQDQLIEK 
REPGSGTESDTSPDFHNQENEPSQEDPEDLDGSVQGVKPQKAAS 
STSSGSHHSSHKKRKNKNRHSPSGMFDYDFEIDLKLNKKPRADY 


6388 
6389


1 


662 


PGPTHASADAWADAWAQPNMAMHNKAAPPQI PDTRRBIjAELVKR 
KOJSLAETLAI^ERQIYT^TOSYLBDTQMYGNIIRGWDRYIjTNQK 

nsnskndrrnrkfkeaerlfskssvtsaaavsalagvqdqliek 
repgsgtesdtspdfhnqenepsqedpedldgsvogvkpqkaas 
stssgshhsshkkrknknrhspsgmfdydfeidlkiankkprady 




Lit 1% 


497 


aepgdrmaghrlvlvlgdlhiphrcnslpakfkioilvpgkiqhr 
lctgnlctkesydylktlagdvhivrgdfdenlnypeqkvvtvg 
ofkiglihghqvipwgdmasi^llqrqfdvdilisghthkfeaf 
ehenkfyinpgsatgaynaletni ipsfvlmdiqastwtyvyq 
ligddvkverieykkp 


6390 


158 


535 


GEERKEGRAPGKAFAPERNPAKMEKEETTRELLIiPNWQGSGSHG 

LTI aqrddgvfvqevtqns paartgwkegdqivgati yfdnlq 

SGEVTQLLNTMGHHTVGLKLHRKGDRFF PS LGQTWDP 


6391 


5386


2897 


VRWNS KTE C YIjS I QTQENFPANLNE LVNC I VI SSLVTTQRKIxkA 
MSLLGSRNQLARAVLNPNPMDFCTKDLLTTTSERI iaylrdfne 
DQKKAI ETAYAM VKHS P S VAKI CL IHG P PGTGKS KT 1 VGLLYRL 








LTENQRKGHSDENSNAKI KQNRVLVCAPSNAAVDELMKKI ILEF 
KEKCKDKKNPLGNCGDINLVRLGPEKS INSEVLKFS LDSQVNHR 
MKKELPSHVQAMHKRKEFLDYQLDELSRQRALCRGGREIQRQEL 
DENIS KVS KERQEliAS KIKEVQGRPQKTQS 1 1 ILESHI ICCTLS 
TSGGLLLESAFRGQGGVPPSCVIVDEAGQSCEIETLTPLIHRCN 

KLILVGDPKOLPPTVlSMKAOEYGVnftQMNnkOnTiOT t v-BR-nrmnr 

MISPiPlLQLTVQYRMHPDICIiFPSNYVYNRNLKTNRQTEAIRC 
SSDWPPQPYLVPDVGDGSBRRDNDSYINVQEIKLVMEIIKLIKD 
KRKDVSFRNIGIITHYKAQKTMIQKDLDKEFDRKGPAEVDTVDA 
FQGRQKDCVIVTCVRANSIQGS IGFLASLQRIAJVTITRAKYSLF 
ILGHLRTIiMENQHWNQIiIQDAQKRGAI I KTCDKNYRHDAVKILK 

YHTPSDSKE ITLTVTS KDPERPPVHDQLQDPRLLKRMG IEVKGG 
IFLWDPQPSSPQHPGATPPTGEPGFPWHQDLSHVQQPAAWAA 
LSSHKPPVRGEPPAASPEASTCQ3KCDDPEEELCHRREARAFSE 
GEQE KCGS ETHHTRRNSRlfDKRTL EQEDS SS KKRKLIi 


6392


972 


186 


GRTGVDI^SSMAHRLQIRLLTWDVKlJrLliRlRHPl^EAYAT'KAR - 
AHGIjEVEPSALEQGFRQAYRAQSHSFPNYGLSHGLTSRQWWLDV 
VLQTFHLAGVQDAQAVAP IAEQLYKDFSHPCTWQVLDGAEDTLR 
E CR TRGIiRLA VI SNFDRRLEG IliG GLGLREHFD FVLTS EAAGW P 
KPDPRIFQEALPJiAHMEPVVAAHVGDNYIiCDYO/3PRAVGMHSFL 


6393


2017 


730 


TGG3 KMAAVATCG3 VAASTGSAVATAS KSNVTS FQRRG PRAS VT 

NDSGPRLVS IAGTRPSVRNGOLLVSTGLPALDQLiLGGGLAVGTV 

LLI EEDKYNI YS PLL FKY FLAEG I VNGHTLL VAS AKEDPANILQ 

ELPAPLLDDKCKKEFDEDVYNHKTPESNI KMKI AWRYQLLPKME 

IGPVSSSRFGHYYDASKRMPQELIEASNWHGFFLPEKISSTLKV 

EPCSLTPGYTKLLQFIQNIIYEEGFDGSNPQKKQRNILRIGIQN 
LGSPIiWGDDICCAENGflNHW^T.TKPT v^rr.onT y dtct cTinrrmM 

PTHLI QNKAI I ARVTTLSD WVGLES FI GSBRETNPLYKD YHGL 
IHIRQI.PRLNNLICDESDVKDLAFKLKRKLFTIERIjHLPPDLSD 
TVSRSSKMDLAESAKRLGPGCGMMAGGKKHLDF 


6394 


1418 


511 


GAAAGGEGARRJiPAAMATVMAAtfAA^ 

VLKGLQDILKEASLRFTIjPGffGTEnTJaiffiWMWTr r»erv^i*r»r\tnr^ 
VLTLQGDALSQAI)VNLKMPRNNQLLHFAFREDKQWKLQQIQDAR 
NHVSQAIYLLTSPJ5QSYQFKTGAEVLKLMDAVMLQLTRARNRLT 
TPATLTLPEIAASGLTRMFAPALPSPLLVNVYINLNKLCLTVYQ 
liHALQPNSTKNFRPAGGAVLHSPGAMFEWGSQRLEVSHVHKVEC 
VIPWLNDALVYFTVSLQLCQQLKDKISVFSSYWSYRPF t 


6395 


13


658 


PSGRPTRPLCCAARRGAARriGGSVSGWPAGRTPTETSNPGS S VM 
ESVTFEDVAVEFIQEWALLDSARRSLCKYRMLDQCRTLASRGTP 
P CKPS CVS QLGQRAE PKATERG I LRATGVAWESQLKPEELP SMQ 

DLLEEASSRDMQMGPGLFLRMQLVPSIEERETPLTREDRPALQE 
PPWSLG CTGLKAAMQIQRWI PVPTLGHRNPWVARDSGE 


6396 


1 


1221 


ANIIiSSPSKRGQKGTLlGYSPEGTPLYNFMGDAFQHSSQSIPRF 
IKESLKQILEESDSRQIFYFLCLNLLFTFVELFYGVLTNSLGLI 
SDGFHMLFDCSALVMGLFAALMSRWKATRI FSYGYGRI EILSGF 
INGLFLIVIAFFVFMES VARL I DPPELDTHMLTPVS VGGLI VNL 
IGICAFSHAHSHAHGASQGSCHSSDHSHSHHMHGHSDHGHGHSH 
GSAGGGMNANMRGVFXjHVLADTLGSIGVIVSTVLIEQFXSWFIAI) 
PLCSLFIAILIFLSVVPLIKDACQVLLLRLPPBYEKELHIALEK 
IQKIEGLISYRDPHFWRHSAS I VAGTI HI Q VTS DVLEQRI VQQV 
TGILKDAGWNLTIQVEKEAYFQHMSGI^TGFHDVLAMTKQMES 
MKYCKDGTYIM 


6397 


391


122 


GAGGVGRFE AI RAPARMI E WCNDRLG KKVRVKCNTDDT I GDLK " 
KL IAAQTG TRWN K I VLKKW YT I FKDHVS LGD YE IHDG MNLE L Y Y 


6398


353 


1306 


HKQMGPL INKCKK1 bLPTTVPPATMKl WLLGGl*liP FLLLLSGLQ 

RPTEGSEVAI KIDFD FAPGS FEDQYQGCSKQVMEKLTQGDYFTK 

DI EAQKNYFRM WQKAHLAWLNQGKVL PQNMTTTHAVAI LFYTLN 

SNVHSDFTRAMAS VARTPQQ YERS PH F KYLH YYLTS AI QLLRKD 

S I NENGTL CYEVHYRTKDVH FNAYTGATIRFGQPLS TS LLKEEA 

QEFGNQTLPTIFTCLOAPVQYPSLKKEVLIPPYELFKVINMSYH 
PRGDWLQLRSTGNIiSTYNCOL.TiK'aQQK'tfr'TtsTMJTftT^oT o n T 

VIIFSKSRV 


6399 
6400


75 


1245 


PNLETYFGRRCEKDSMNFTPTHTPVCRKRTVVSKRGVAVSGPTK 
RRGMADSIiESTPIiPS PEDRLAKLHPS KELLE Y YQKKMAE CE AEN 

EDLLKKLBLYKEACEGQHKLECDLQQREEEIAELQKALSDMQVC 
LFOEREHVLRL YS ENDRLR I RELEDKKK I QNI*LALVGTDAGEVT 
YFCKEPPHKVTILQKTIQAVGECEQSESSAFKADPKISKRRPSR 
ERKESSEHYQRDIQTLIIjQVEALQAOLGEQTKLSREQIEGIjIED 
RRlHIiKEIQVQHQRNQNKIKELTKNLHHTQELLYESTKDFLQLR 
SENQNKEKSWMLEKDNLMSKI KQYRVQCKKKEDK1GKVLPVMHE 
SHHAQSEYIKVMSLCRNEWYFSGRVEGIPKNLQFVM 




2520


1053 


ktmkcdevvykvqsaiu^cgyamktgkffhnlmerkdfetwl 

UWA& v 1 *■ -"oiJ 1 UIjQKNETIiDriL I S LSGAVQ LRHLSNNliE tll kr 

DFLKLLPLBI#SFYl«LKWLDPO/rLLTCCLVSKQWNKVISACTEVW 

O/TACKKLGi^lDDS VQDALHWKKVYLKAI LRM KQLEDHEAFETS 

SLIGHSARVYALYYKDGIiLCTGSDDLSAKLWDVSTGQCVYGIQT 

HTCAAVKFDEQKLVTGSFDNTVACWEWSSGARTQHFRGHTGAVF 
SVDYNDELDTliVSf3QAn , PTVTn7waT Qfcr*nv»r vm*r nvnTrnnr<T«. n . 

*' i «*'«"'-'a,uvi300rtur -i VJlVWAijot/ilxl VJIiNTiiTGHTEWVTICV 

VLQKCXVXSLLHS PGDYILLSADKYE I KI WPIGRE I NCKCLKTIi 
S VSEDRS I CLQPRLHFDGKY1 VCSS ALGLYQWDFAS YD ILRVIK 
TPEIANLALLG FGD I FALLFDNR YL YI MDLRTESLI S RWPLPE Y 
RKSKRGSSFLAGEASWLNGLDGHNDTGLVFATSMPDHS IHLVLW 


6401 
6462 


109 


766 


PGAAWSRPDIjRGCCTGPQPALRMLVLPS pcpqplafss^/etReg" 

PPRRTCRSPEPGP<?^*5TfiQDnaQCDDOOMITVT T TTvnAmmwm.« 

vdeesqrepgasgapgqkkcyscpvcsrvfeymsylqrhsiths 
evk^fecdicgkafkrashlarhhsihlagggrphgcplcprrf 

RDAGBLAQHSRVHSGERPFQCPHCPRRFMEQNTIiQKHTRWKHP 




1196 


279 


TTSQCGGIRQSSAIPVASMEFAAICLRNALLLIiPEEQQDPKQBN 
GAXNSNQLGGNTESSESSETCSSKSHDGDKFIPAPPSSPLRKQE 

LENLKCSILACSAYVALALGDNLMALNHADKLLQQPKLSGSLKF 
LGHLYAAEALISLDRISDAITHLNPENVTDVSLGISSNEQDQGS 

DKGENEAMESSGKRAPQCYPSSVNSARTVNLFNLGSAYCLRSEY 

dkarkclhqaasmihpkevppeaillavylelqnontqlalqii 
krnqllpavkthsbvrkkpvfqpvhpiqp iqmpafttvork 


6402 r 
6404 j 


2 

1012 


1690 

1 

J 

222 J 


RGIHTSVLQGNLQNQMYSHNWI^^ 

RSVDDTSQA1QRIKNDFQNLQQVFLQAKKDT0WLKEKVQSLQTL 
AANNSAIiAKANNDTLEDMNSQLNSFTGQMENITTISQANEQNLK 
DLQDIjHKDAENRTA I KFNQLEERFQL FETD I VN I ISNIS YTAHH 

LRTLTSNLNEVRTTCTDTLTKHTDDLTSLNNTLANIRLDSVSLR 

MQQDI*MRS rldtevanls VIMEEMKLVDSKHGQLI KN FT I LQGP 

PGPRGPRGDRGSQGPPGPTGNKGQKGEKQEPGPPGPAGBRGPIG 

PAGPPGERGGKGSKGSQGPKGSRGSPGKPGPQGPSGDPGPPGPP 

3KEG LPGPQGP PGFQGLQGTVGE PG VPG PRGLPGLPG VPGWPGP 

KGPPGPPGPSGAWPLALQNEPTPAPEDNSCPPHWKNFTDKCYY 

FSVEKEIFEDAKLFCEDKSSHLVFINTREEQQWIKKQMVGRESH 

^IGLTDSERBNEWKWLDGTSPDYBCNVJKAGQPDNWGHGHGPGEDC 

W3I)IYAGQWNDFQCEDVNNFICEKDRETVLSSAL 

1AAIAMAAPAFGL I S VFS S SQE LQAAItAQ IjVAQRAACCLAGARA 








RFALGLSGGSLVSMIiARBLPAAVAPAGPASLARWTUSFCDERLV 
PFXJHAESTYGLYRTHLLSRLPIPESQVITINPELPVEEAAEDYA 
KKLRQAFQGDS I PVFDLLILG VGP DGHTCSLFPDHPLLQERK KI 
VAPISDS PKPPPQRVTLTLPVLNAARTVI FVATGEGKAAVLKRI , 
LEDQEBNPLPAALVQPHTGKLCWFLDEAAARLLTVPFEKHSPL 


6405 


1 


1456 


AALPRPTPRAPLGREGTGSDSEMAASMFYGRIjVAVATLRNHRPR ' 

TAQRAAAQVLGSSGLFNNHGLQVQQQQQRNLSLHEYMSMELLQB 

AGVSVPKGYVAKSPDEAYAIAKKLGSKDWIKAOVLAGGRGKGT 

FESGLKGGVKIVFSPEEAKAVSSQMIGKKLFTKQTGEKBRICNQ 

VLVCERKYPRREYYPAITMERSPQGPVLIGSSHGGVNIEDVAAE 

TPEAI IKEPIDIEEGI KKBQALQLAQKMGFPPNIVESAAENMVK 

LYSLFLKYDATMIBINPMVEDSDGAVLCMDAKINPDSNSAYRQK 

KIFDLQDWTQEDERDKDAAKANLNYIGLDGNIGCLVNGAGIjAMA 

TMD 1 1 KLHGGTPANPIiD VGGGATVHQVTEAFKIiITSDKlCVLAI L 

VNI FGGIMRCD VI AQG IVMAVKDLEI KI P WVRLQGTRVDDAKA 

LIADSGLKILACDDLDEAARMWKIjSEIVTIJUCQAHVDVXPQLP 


" 6406 


103* 


167 


HPRQMRGEDTPEAPPYSSGRYDSIKTEVSGCPBDLTVGRAPTAD ' " 

DDDDDHDDHEDNDKMNDS EGMDPERLKAFNMFVRLFVDEMLDRM 

VPISKQPKEKIQAIIESCSRQFPBFQERARKRIRTYLKSCRRMK 

KNGMEMTRPTPPHLTSAMAENILAAACESETRKAAKRI^RIjEIYO 

S S QDEP I ALDKQKS RDS AAI THS TYSLPASS YSQDPVYANGGUf 

YSYRGYGALSSNLQPPASLOTGNHSNGESGEARALASRPAPSWV 

CRAALGSGMGRGKQRPVMERGCLTA 


6407 


492 


150 


VGLCnAVSQTVLAQLDAIiliVFPGQ VAQLS CTLS PQHVTI RD YG V 
S WYQQRAGSAPRYLLYYRSEEDHHRPADI PDRFSAAKDEAHNAC 
VLTISPVQPEDDADYYCSVGYGFSP 


6408 


1458 


903 

• »■ 


RG C I TSS QAWRIiFGGVTRGFNMRI EKCY FCSG P I YPGHGMMF VR 
NDCKVFRFCKS KCHKNFKKKRNPRKVRWTKAFRKAAGKELTVDN 
SFEFEKRRNEPIKYQRELWNXTIDAMKRVEBIKQKRQAKFIMNR 
LKKNKELQKVQDIKBVKQNXHLIRAPLAGKGKQLEEKMVQQLQE 
DVDMEDAP 


6409 


150 


446 


NTALANLLRCFTCDRLCGGCTAPAPPAHQGIVLQPVMPSCDPGP 
GPACLPTKTFRSYIjPRCHRTYSCVHCRAHLAKHDELISKSFQGS 
HGRAYLFNSV 


6410 


85 


607 


RGGTAGCVACLGCWGQSSSPKAAFPAGSACLPADSCPCLLFQAC 
AISGLFNCITIHPLNIAAGVWMIMNAFILLLCEAPFCCQFIEFA 
NTVAEKVDRLRS WQKAVFYCGMAWP I VISLTLTTLLGNAIAFA 
TGVLYGLSALGKKGDAISYARIQQQRQQADEEKLAETLBGEL 


6411 


302 


772 


RLSIMASSLNEDPEGSRITYVKGDLFACPKTDSLAHCISEDCRM 
GAG I AVLFKKKFGGVQELIiNQQKKSGEVAVLKRDGRYI YYIilTK 
KRASHKPX^ENLQKSLEAMKSHCLKNGVTDLSMPRIGCGLDRLQ 
WENVS AM I E EVFEATD I K I TVYTL 


6412 

- 


61 


1709 


RPVTSFSPLPGSCGGRLGTRTMLGRSLREVSAAliKQGQITPTEL " 

CQKCLSLIKKTKFLNAYITVSEEVALKQAEESEKRYKNGQSLGD 

LDGIPIAVKDNFSTSGIETTGASNMLKGYIPPYKATWQKLLDQ 

GALLMGKTNIiDEFAMGSGSTDGVPGPVKNPWSYSKQYREKRKQN 

PHSENEDSDWLITGGSSGGSAAAVSAFTCYAALGSDTGGSTRNP 

AAHCGLVGFKPS YGLVSRHGL IPLVNS MDVPG I LTRCVDDAAI V 

I/SAIiAGPDPRDSTTVHEPINKPFMLPSLADVSKLCIGIPKEYLV 

PELSS E VQSLWS KAAD L FE S EGAKVT EVS LPHTS YS I VC YHVLC 

TSEVASNMARFDGLQ YGHRCD I DVS TEAMYAATRREG FN0 WRG 

RILSGNFFLLKENYENYFVKAQKVRRLIANDFVNAFTISGVDVLIj 

TPTTLSEAVPYIjEFIKEDNRTRSAQDDIFTQAVNMAGLPAVS IP 

VALSNOGLPIGLQFIGRAFCDQQLLTVAKWFEKQVQFPVIQLQE 

LMDDCSAVLBNEKLASVSLKQ 


6413 


2 


885 


HEPRCAGMAASLWMGDLEPYMDBNFISRAF'A'l'MGETVMSVKliR 
NRLTG I PAG YCFVE FADLATAE KCLH KINGKPL PG ATPAKRFKL 
NYATYGKQPDNSPEYSLFVGDLTPDVDDGMLYEFFVKVYPSCRG 
GKVVLDQTGVSKGYGFVKPTDBZ.EQKRALTECQGAVGLGSKPVR 
LSVAI PKAS RVKPVEYSQMYS YS YNQYYQQ YQN YYAQWGYDQNT 
GSYSYSYPQYGYTQSTMQTYEEVGDDALEDPMPQLDVTEANKEF 
MEQSEELYDALMDCHWQPLDTVSSEIPAMM 


6414 


1 


538 


RGGRAALLPWRRFPCCRPKPQPARPSSRATPGPRSPGMATSIGV 
S FSVGDGVPEABKNAGE PENTY I LRP VFQQR FRPS WKD CIHAV 
LKEELANAEYSPEEMPQLTKHLSENIKDKLKEMGFDRYKMWQV 
VIGEQRGEGVFMASRCFVTOADTDNYTHDVFMNDSLFCVVAAFGC 
FYY 


6415 


2 


1168 


FVRQWQSSHRRACGLGCEARAGGGEEPRGRASSVAGWVGAFRAP" " 
FIEAAVAGIX^GSGKRRRGWKMPVHSRGDKKETNHHDEMEVDYA 
EWEGSSSEDEDTESSSVSEDGDSSEMDDEDCERRRMECIiDEMSN 
LEKQFTDLKDQLYKERLSQVDAKLQEVIAGKAPEYLEPLATLQE 
NMQ I RTKVAG t YRELCLES VKNKYECE IQ AS RQHCES E KLLLYD 
TVQSELEEKIRRLEEDRHSIDITSELWNDELQSRKKRKDPFWPD 
KKKPGVVSGPYIVYMLQDLtDILEDWTTIRKAMATLGPHRVKTEP 
PVKLEKHLHSARSEEGRLYYDGEW YIRGQT I C IDKKDECPTSAV 
ITT INHDE VW FKRPDGS KSKL Y I S QLQKG KYS I KHS 


6416 


410 


1519 * 


EIAPADLEIPACAPVLLSRATSSTMSVTGGKMAPSLTQEiLSHIi 
GLASKTAAWGTLGTLRTFLNFSVDKDAORLLRAITGQGVDRSAI 
VDVLTNRSREQRQLISRNFQERTQQDLMKS LQAALSGNLERI VM 
ALLQPTAQFDAQELRTALKASDS AVDVAI E I LATRTP PQLQECL 
AVYKHNFQVEAVDG ITS ETSG I LQDLLLALAKGGRDS YS GI I D Y 
NIiAEQDVQAIiQRAEGPSREETWVPVFTQRNPEHLIRVFDQYQRS 
TGQELEEAVQNRFHGDAQVALI^5LAS VIKNTPLY FADKLHQALQ 
ETEPNYQVL IR I L ISRCETDLLS I RABFRKKFGKSLYSS LQDAV 
KGDCQSALLALCRAEDM 


. 6417 


1 


r 845 


RGESR VLWS ELEGEAGGAGGWAS S Ltf ARMDNRFATAPVIACVLS 1 
LI STI YMAAS IGTDF WYEYRS P VQENSSDLNKS IWDEFI S DEAD 
EKTYKDALFRYNGTVGIiWRRCITI PXNMH WYS PPERTES FDWT 
KCVSFTLTEQFMEKFVDPGNHNSGIDLLRTYLWRCQFLLPFVSL 
GLMCFGALIOLCACICRSLYPTIATG I LHLLAGLCTIiGS VSCYV 
AG I ELL HQKLEIiPDNVS GE FG WS FCLACVSAPLQFMASAL FIWA 
AHTNRKEYTLM KAYR VA 


6418 


2 


662 


TRTRPRRPPGLGAAVGKAGARSTS TPAGAS PAAAYQADP P P PAH 
TPAPPPPPPOGGIACHGEPAKFYG YDNLQRQP I FTTQQEAELVQ 
YPDCKS SSGNI GEDPDHLNQSS S P S QMF PWMRPQAAPGRRRGRQ 
TYS RFQTLELE KE FLFNP YLTRKRR I EVS HALALTE RQVKI WFQ 
NRRMKWKKENNKDKFPVSRQEVKDGETKKEAQELEEDRAEGLTN 


6419 


1 


973 


PGRPRVRNFDLNSKSILQEFFCTRSIQIPANRSKTAMSKCPIFP 
MARS ISTSGPLDKEDTGRQKLISTGSLPATLQGATDSLGLEWHL 
PS PD P VTVPYLS PL WWKELESLLB MEG DHAI TVAD FVDHHPI V 
FWNLVWYFRRLDLPSNLPGL I LS SB HCNKYS K I PRH CMS EDSKY 
VL I QMLWDNMKLHQDPGQ P L Y I LWNAHTQKY PMVHLL QKSDNS F 
NQELLKSMVKSIKMNDVYGPMSQILETLNKCPHFKRQRSLYREI 
LFLS L VALGREN I D I DAFDKE YKMAYDRLTPSQVKSTHNCDRPP 
STGVMECRKTFGEPYL 


6420 


207 


1187 


RKMIDKNQTCGVGQDSVPYMICLIHILEEWFGVEQLEDYLNFAN 
YLLWVFTPLILLILPYFTI FLLYLTI I FLHI YKRKNVLKEAYSH 
NLWDGARKTVATLWDGHAAVWHGYEVHGMEKIPEDGPALI I FYH 
GAI P I DFYYFMAKI FIHKGRTCRWADH FV FKI PGFSLLLDVFC 
ALHGPREKCVEILRSGHLLAISPGGVREALISDETYNIVWGHRR 
GFAQVAIDAKVP I I PMFTQNI REGFRSLGGTRLFRWLYEKFRYP 



6421 



6422 



6423 






"362- 




FAPMYGGPPVKLRTYLGDPIPYDPQITAEKlAKKTKNAVQAtlD 
KHQRI PGNIMS ALLBRFH 

| WALSLRRQPfiRMSN KLLSPHPHS WLRS EFKMASSPAVLRASRL 
YQWSLKSSAQPLGSPQLRQVGQIIRVPARHAATLILEPAGRCCW 
DEPVRIAVRGLAPEQPVTLRASLRDBKGALFQAHARYRADTLGE 
LDLERAPALGGS FAGLE PMGLLWALE PE KPL VRLVKRD VRTPIiA 
VELB VLDGHDPDPG RLLCQTRHER YFLP PG VRRE PVRVGRVRGT 
LFLPPEPGPFPG1VDMFGTGGGLLEYRASLLAGKGFAVMALAYY 
NYEDLPKTMETLHLEYFEEAMNYLLSHPEVKGPGVGLLGISKGG 
ELCLSMAS FLKG 1TAAWINGS VANVGGTLRYKGETLP PVGVNR 
NRIKVTKDGYADIVDVLNSPLEGPDQKSFIPVBRAESTFIiFLVG 
I QDDHNWKS E PYANEACKRLOAKGRRKPQ 1 1 CYPETGHY I EPPYF 
■ PLCRAS LHALVG3 P I IWGGEPRAHAMAQVDAWKQLQTFFHKHLG 
GREGTIPSKV 



2133 



614 



1237 



EGENLSWFQEFWGDIAKEFYWKTPCPGPFliRYNFDVTKd»KIFIE~ 
WMKGATTNICYNVLDRNVHEKKLGDKVAFYWEGNEPGETTQirY 
HQLLVQVCQ PSNVLRKQGI HKGDRVAX YMPM I PELWAM LACAR 

IGALHSIVFAGFSSESLCERILDSSCSLLITTDAFYRGEKLVNL 
KELADEALQKCQEKGFPVRCCIVVKHLGRAELGMGDSTSQSPPI 
KRS CPDVQI SWNQGIDLWWHELMQEAGDECE PEWCDAEDPLFIL 
YTSGSTGKPKGWHTVGGYMLYVATTFKYVFDFHAEDVFWCTAD 
IGW I TGHS YVTYGPLANGATS VLFEG I PTYPDVNRLWS I VDKYK 
VTKFYTAPTAIRLLMK FGDEPVTKHSRASLQ VLGTVGE P INPEA 
WLWYHRWGAQRCPTVDTFWQTETGGHMLTPLPGATPMKPGSAT 
FPFFGVAPAILNESGEELEGEAEGYLVFKQPWPGIMRTVYGNHE 
RFETTYFKKPPGYYVTGDGCQRDQDGYYWITGRIDDMLNVSGHL 
LSTAEVESALVEHEAVAEAAVVGHPHPVKGECLYCFVTLCDGHT 
FSPKLTBELKKQXREKIGPIATPDYIQNAPGLPKTRSGKIMRRV 
' LRKIAQNDHDLGDMSTVADPSVISHLFSHRCLTIQ 



11BB 



ANLKE I PRDliP PETVLL YIiDSNQI TS i! PitfE I FKDtikQlji VLNLS" 
KNGIEFIDEHAFKGVAETLQTLDI^BNRIQSVHKNAFNNLKARA 
RIANNPWHCDCTLQQVLRSMASNHETAHNVI CKTSVLDEHAGRP 
FLNAANIWDLCNLPKKTTDYAMLVTMFGWFTMVISYVVYYVRQN 
QEDARRHLEYLKSLPSRQKKADEPDDISTW 



KKVSWPVAAMVHCSCVLFRKYGNFIDKIiRLFTRGGSGGMGYPRL" 
GGEGGKGGDVWWAHNRMTIjKQLKDRYPRKRFVAGVGANSKISA 
LKGS KGKDWEI PVPVGI S VTDENGKI IGELNKENDRI LVAQGGL 
GGKLLTNFLPLKGQKR I IHLDLKIilADVGLVGFPNAGKSSLLSC 
VSHAKPAIADYAFTTLKPELGKlMYSDFKQrSVADLPGLIEGAH 
MNKGMGHKFLKHIERTRQLLFVVDISGFQLSSHTQYRTAFETII 
LLTKEI^LYKEELQTKPALLAVNKMDLPDAQDKFHELMSQLQNP 
KDFLHLFEKNMIPERTVBFQHIIPISAVTGEGIEEIjKNCIRKSL 
DEQANQENPALHKKQLLNLWI3DTM3STEPPSKHAVTTSKMDII 



T42T 



30 



"56T 



IiAMEGGGGIPLETLKEESQSRHVLPASFEWSI^KSNWGFLlTTG 
LVGGTLVAVYAVATPFVTPALRKVCLPFVPATMKQ I ENWKMIiR 
CRRGS LVDIGSGDGRI VIAAAKKGFTAVG YELNP WLVWYS RYRA 
WREG VHGSAKFY I SDLWKVTFSQYSNVVT FGVPQMMLQLEKKLE 

RELEDDARVIACRFPFPHWTPDHVTGEGIDTVWAYDASTFRGRE 
KRPCTSMHFQLPIQA 



"6427" 



145 



SRGAAVGGMSVAGGEIRGDTGGEDTAAPGRFS FS PEPTLED I RR~ 
LHAE FAAERDWBQFHQPRNLLLALVGE VGELABLFQW KTDGE PG 
PQGWS PRERAALQEELSDVLI YLVAIiAARCRVDLPLAVLSKTSDI 
NRRRYPAHLARS SSRKYTELPHGAISEDQAVGPADI PCDSTGQT 
ST 



959 I AASWGPPHVPKAGKMVSWMICRIjWLVFGMLCPAYASYKAVKTK" 

NIREYVRWMMYWiyFALFMAAEIVTDIFISWFPFYYEIKMAFVL 








WLLS P YTKGAS LL YR KF VHP S LSRHE KS I DAY I VQAKE R S Y ET V '" 
LS FGKRGLNIAASAAVQAATKSQGALAGRLRS FSMQDLRS I SDA 
PAPAYHDPLYLEDQVSHRRPPIGYRAGGLQDSDTEDECWSDTEA 
VPRAPAR PREK PLI RSQS LR WKR KPP VRBGTSRS L KVRTRKKT 
VPS DVDS 


6428 


1982 


444 


sgsggkmedhqhvpidiqtsklldwlvdrrhcslkwqsi.vltir , ~ 

EKINAAIQDMPESEEIAQLLSQSY1HYFHCLRILDLLKGTEAST 
KN1 FGR YS SQRMKDWQE 1 1 ALYE KDNTYL VELS S LLVRNVNYEI 
PSLKKQIAKCQQLQQEYSRKEEECQAGAAEMREQFYHSCKQYGI 
TGENV^GEIJJUjVKDLPSQIAE IGAAAQQS LGEAI DVYQAS VGF 
VCESPTEQVLPMLRFVQKRGNSTVYE WRTGTE PS WERPHLEBL 
PEQVAEDAID WGDFGVEAVSEGTDSGI SAEAAGXDWG I FPES DS 
KDPGG DG ID WGD DAVALQ I TVLfiAGTQAPEG VARGPDALTLLEY 
TBTRNQFLDEIiMELElFIiAQRAVELSEEADVIiSVSQFQLAPAIL 
QGG/TKEKMVITIVSVIiEDIjIGKLTSLQLQHLFMILASPRYVDRVT 
EFLQQKLKQSQLLALKKELKVQKQQEALEEQAALEPKLDLLLEK 
TKELQKLI BAD I S KRYSGRP VNLMGTSL 


6429 


3413 


3442 


epsswtaaprgpiaahpLeaaVqeddrralsfdsrikvfangtl 
wksvtdkdagdylcvarnkvgddywlkvdvvmkpaki ehkee 
ndhkvfyggdlkvdcvatgxipnpelswsdpdgslvnsfmqsdds 
ggrtkrywfnngtlyfnevgmreegdytc faenqvgkdemrvr 

VKWTAPAT1 RNKT CLAVQ VP YGD WTV ACEAKG E P M P KVTW LS 

ptnkviptssbkyqiyqdgtlliqkaqrsdsgnytclvrnsage • 
DRKTVWIHVNVQPPKINGNPNPITTVREIAAGGSRKLIDCKAEG 
IPTPRVLWAFPEGVVLPAPYYGNRITVHGNGSLDIRSLRKSDSV 
QLVCMARNEGGEARLIVQLTVLEPMEKPIFHDPISEKITAMAGH 
TISLNCSAAGTPTPSLVWVLPNGTDLQSGQQLQRFYHKADGMLH 

ISGLSSVDAGAYRCVARNAAGHTERLVSIiKVGIjKPEANKQYHNL 
VSIINGETLKLPCTPPGAGQGRFSWTLPNGMHLEGPQTLGRVSIj 
LDNGTLTVREAS VFDRGTYVCRMETEYGPS VTS I P VI VIAYPPR 
ITSEPTPVTYTRPGNTViaiNCMA^IPKAOITWELPD^ 

VQARLYGNRFLHPQGSLTIQHATQRI1AGFYKCMAKNILGSDSKT 
TYIHVF 


6430 


1946 


602 


RTRVSTGIjRRTLLWSEAVGASSTRGDTGIPGSGEGGAGPGGGEG " 

AMLEAMAE PSPBDP PPTLKPETQPPEKRRRTI ED FNKFCS F VLA 

YAGYIPPSKEESDWPASGSSSPLRGESAADSDGWDSAPSDLRTI 

0TFVKKAKSSKRRAA0AGPTQPGPPR5TFSRLQAPDSATLLEKM 

KLKDSLFDLDGPKVASPLSPTSIjTHTSRPPAALTPVPLSQGDLS 

HPPRKKDRKNRKLGPGAGAGFGVLRRPRPTPGDGEKRSRIKKSK 

KRKLKKAERGDRLPPPGPPQAPPSDTDSEEEBEEEEEEEEBEMA 

TVVGGEAPVPVLPTPPEAPRPPATVHPEGVPPADSESKEVGSTE 

TSQDGDASSSEGEMRVMDEDIMVESGDDSWDLITCYCRKPFAGR 

P MI E CS LCGTWIHLS CAKI KKTNVPDFFYCQKCKELRPE ARRLG 

GPPKSGEP 


6431 


3 


605 


WWNSSYNLPAYAPYLPCEACAMQDGRKGGAYAGKMEATTAGVGR"" 
LEEEALRRKERLKALREKTGRKDKEDGEPKTKHLREEEEEGEKH 
RELRLRNYVPEDEDLKKRRVPQAKP VAVEE KVKEQLEAAKPEP V 
IEEVDLANUIPRKPDWDLKRDVAKKLEKLKKRTQRAIAELIRER 
LKGQEDS LASAVDAATEQKTCDSD 


6432 


56 


1692 


GGLGTMGSRIKQNPETTFEVYVEVAYPRTGGTLSDPEVQRQFPE 
D YSDQE VLQTLTKFCFP FYVDSLT VSQ VGQNFTFVLTDI DS KQR 
FGFCRLSSGAKSCFCILSYLPWFEVFYKLLNILADYTTKRQENQ 
WNELLETLHKLP I PDPGVSVHLS VHSYFTVPDTRELPS I PENRN 
LTE YFVAVDVNNMLHLYASMLYERRILI I CSKLSTLTACIHGSA 
AMLYPMYVJQHVYIPVLPPHLLDYCCAPMPYLIGIHLSLMEKVRN 
MALDDWlLNVDTOTLETPFDDLQSLPNDVISSLKNRIiKKVSTT 








TGDGVARAPLKAQAAFFGSYRNALKIEPBBPITFCBBAFVSHYR 
SGAMRQFLQNATQLOLFKQFIDGRLDLLNSGEG PSDVFEEEHJM 
\jb ih.\3S3U3\LiX nUWi-kii I VKKGSGAI IiNTVKTKANPAMKTVYKFDI 
ABNGCAPTPEEQLPKTAPSPLVEAKDPKLRBDRRPITVHFGQVR 
PPRPHWKRPKSNIAVEGRRTSVPSPBQNTIATPATLHILQKSI 
TKFAAKFPTRGWTSSSH 


6433 


1524 


484 


APVTKRKEVFAXDSKGSALDAGRDPKRPALPErLCESGWASNTA 
PTTPP QPGWCLCGKD PKSS CQTPGRE KERR1ATMHGS CS FLMLL 
LPLLLLLVATTGPVGALTDEEKRLMVELHNI*yRAQVSPTASDML 
HNRWDEELAAF AXAYARQCVWGHNKERGRRGBNIiFAlTDEGMDV 
PLAMEEWHHEREHYNLSAATCSPGQMCGHYTQWWAKTERIGCG 
SHFCEKLO^VEETNTELLVCNYEPPGNVXQKRPYQEGTPCSQCP 
SGYHC KNSLCE P I GS PEDAQDLP YLVTEAPS FRATEAS DSRKMG 
AEGPDKPSWSGLNSGPGHVWGPLLGIjLLLPPLVIiAGIF 


6434 


40 


2002 


mpqi*nfgmadptqmggls^llagehalgtpevfsgtcrpdVse 

S PELRQKS PL FQFAEI S SSTSHSDASTKQCQTS AL FQFAE ISSN 
TSQLGGAEPVKRCGKSALtXJIiAEMCLASEGMKMEES KLI KAKES 
DGGRIKELEKGKEEKEIKMEXTDETRLQKEAEFEKSAKBNIjRDS 
KELRNFEALQ IDD I MA I KMEDPKE IRKEELEEDHKCS HFPD FS Y 
SASSKII I S DV PS RKDHMCH PHG IMI IEDPAALNKPEKLKKKKK 
KSKMDRHGNDKSTPKKTCKKRQSSESDIESVIYTIBAVAKGDWG 
IEKLGDTPRKKVRTSSSGKGSILI1AKJPPKKKVKSREKKMSKEICS 
S DTTKES RP PDF" S 1 3 AS KN I SGETPEG I KAE PLTPMEDALP PS 
LSGQAKPEDSDCHRKlETCGSRKSERSCKGAIiYKTLVSEGMLTS 
LRANVDRGKRSSGKGNSSDHEGCWNBESWTFSQSGTSGSKKFKK 
TKPKEDCLLGSAKLDEBFEKKFNSIiPQYS PVTFDRJCCVPVPRKK 
KKTGNVSSEPTKTSKGSGDKWSNKQLFLDAIHPTEAIFSEDRNT 
MEPVHKVKNIPSIFNTPEPTTTARTFGGQPKEKSKENPDYSPCQ 
DTQRAGYHHEEVLWMTNI^NNCGGVYLKQIiRHTAMT^ 


6435 
6436 


2227 

•r 


Fc 5 ? 

©57 


ALQRCAAAAYAH PE YEE RFLQEET VS QQINS IELLQTRPLALPE ~ 
WKSQRPLQRQVHLRGRPASQPTVIRG I TYYKAKVS EEEND IE E 
QQD E F FS GDNG VDLLi I EDQL LRHNGLMTS VTRRP AATRQGH S TA 
VTSDLNARTAPWSSALPQPSTSDPSIANHASVGPTLQTTSVSPD 
PTRESVLQPSPO^PATTVAHTATQQPAAPAPPAVSPREALMBAM 
HTVP VP P TTVRTDSLGKDAPAGRGTTPAS PTI*SPEBEDDIRNVI 
GRCKDTLSTITGPTTQNTYGRNEGAWMKDPLAKDERIYVTNYYY 
GNTLVEFRNLENFKQGRWSNSYKLPYSWIGTGHVVYNGAFYYNR 
*** A ^ A *i!tiuiji^KyvAAWAMI*HDVAY^ 

VDENGLWLIYPALDDEGFSQEV1VLSKLNAADLSTQKETTWRTG 
LRRNFYGNCFVICGVLYAVDSYNQRNANISTAFDTHTNTQIVPR 
LLFENEYFYTTQIDYNPKDRLLYAWDNGHQVTYHVIFAY j 




1295 


341 


GACR PP VRQDPDSGP D YEALPAGATVTTHWVAGAVAG I LEHC VM 
YP IDCVKTRMQS LQPDPAARYRNVLEALWRI IRTEGLWRPMRGL 
NVTATGAGPAHALYFACYEKLKKTLSDVIHPGGNSHIANGAAGC 
VATLLHDAAMNPAE WKQRMQMYNS P YHRVTDCVRAVWQNEGAG 
AFYR5YTTQLTMNVPFQAIHFMTYEFLQBHFNPQRRYNPSSHVL 
SGACAGAVAAAATTPLDVCKTLLNTQ BSI»AIjNS HITGHI TGMAS 
AFRTVYQVGG VTAY FRGVQARVI YQI PSTAIAWS VYEF FKYLIT 
KRQEEWRAGK 


6437 


1B2B 


360 . 

1 


P PAPAPPAS P ARHVTRTARGHLEGGS RAPPLLQAVFLQI KNMVK 
L I HTLADHGDDVNCCAFS FSLLATCS LDKTIRL YS LRDFTEL PH 
SPLKFHTYAVHCCCFS PSGH ILAS CS TDGTTVLWNTENGQMIiAV 
MEQPSGSPVRVCQFSPDSTCLASGAADGTVVLWNAQSYKLYRGG 
SVKDGSLAACAFSPNGSFF\nt3SSCGDLTVWDDKMRCL^EKAH 
DLGITCCDFSSQPVSDGEQGLQFPRLASCGQDCQVKIWIVSFTH 
ILGFELKYKSTLSGHCAPVLACAFSHDGQMLVSGSVDKSVIVYD 








TNTENILHTLTQHTRYVTTCAPAPNTLLLATGSMDKTVNI WQFD 
LBTIiCQARSTEHQIjKQFTEDWSEEDVSTWLCAQDLKDLVGIPKM 
NNIIX5KBLLNLTIG3SLADDLIQESLGLRSKVLRKIEELRTKVKS 

LSSGIPDEPICPITRELMKDPVIASDGYSYEKEAMENWDPAKRN 

RTSPP 


6438 


109 


901 


BVQILRAKMFQTGGLIVFYGLLAQTf^AQFGGIiPVPLDQTLPLNV 
NPAIiPLSPTGLAGSLTNALSNGLLSGGLLGILEHLPLLDI lkpg 

ggtsggli^li^kvtsvipgl^iidikvtdpqllelglvqsp 

DGHRLYVTI PLGI KLQVNTPLVGASLLRLAVKLD ITAE ILAVRD 
XQER 1 HLVLGDCTHS PGSLQ I S LL DG LGPLP I QGLLDSI/TG I LN 
KVLPELVOGNVCPLVNEVLRGLDITLVHDIVNMLIHGLQFVIKV 


6439 


23 


412 


SIQTASAITTEMASQSQGIQQLLQAEKRAAEKVADARKRKARRL 

KQAKEEAQMEVBQYRREREHEF^SKQQAAMGSQGNIjSAEVEQAT 
RRQVQGMQSSQQRNRERVLAQLLGMVCDVRPQVHPNYRISA 


6440 


3 


517 


RARWNSDMGOLPGLVRIiS IALRIQPNDGPVFY KVDGQr'fGQNRT 
I KLLTGSSYKVE VKI KPS TLOVEN I S IGGVLVPLFT .(r c vp tmr«^ 
RVVYTGTYDTEGVTPTKSGERQPIQITMPFTDIGTFETVWQVKP 

ynyhkrdhcqwgspfsvieyeckpnetrslmwvnkesfl 


6441 


234 


1373 


KSGaiARRQRPQRSAAVGEEELPPGMEKFKAAMLLGSVGDALGY 
RNVCKENSTVGMKIQSELQRSGGLDHLVLSPGEWPVSDNTIMHI 
ATABAIiTTD YWCLDDLYREMVRCYVE I VEKL P EH R PDPAT I EGC 
AQLKPNNYLLAWH'rPFNBKGSGFGAATICAMCIGliRYMKDPPT en 
LIEVSVECGRMTHNHPTGFLGSLCTALFVSFAAQGKPLVQWGRD 
MLRAVPLAEE YCR RTIRHTAE YQEHWFYFEAKWQ FYLEERK I S K 

DSENKAIFPDNYDAEEREKTYRKWSSEGRGGRRGHDAPMIAYDA 
LLAAGUS WTEL CH RAM FHG GES AATGT IAG CLFG JUL YG LDL VP K 
GLYQDLEDKEKLEDLGAALYRLSTEBK 


6442 


34 


796 


aedpag^lagqdtmfarglkrkcvgheedvegalaglktvssys 
lqrqslldmslvklqu:hmlvepnlcrsvliantvrqiqeemtq 

DGTWRTVAPQAAERAPLDRLVSTE I LCRAAWGQEGAHPAS GLGD 
GHTCJGP VSDLCPVTSAQAPRHLQSSAWEMDG PRENRGS FHKSLD 
QIFETLETKNPSCMEELFSDVDSPYYDLDTVLTGMMGGARPGPC 
EGLBGLAPATPGPSSSCKSDLGELDHWEILVET 


6443 
6444 ' 


2 


555 


^PAASSVRPPRPKXEPQTLVIPKNAAEEQKLKLERLMKNPDK 

AVPIPEKMSEWAPRPPPEFVRDVMGSSAGAGSGEFHVYRHLRRR 

EYQRQDYMDAMAKKQKIjDAEFQKRIjEKNKIAAEEQTAKRRKKRQ 

KIiKEKKLLAKKMKLEQKKQEGPGQPKEQGSSSSAEASGTEBEEE 
VPSFTMGR 




390 


899 


GSTPRGKMRAPIPEPKPGDLIEIFRPFYRHWAIYVGDGYWHLA 
P PS EVAGAGAAS VMS ALiTDKAI VKKELL YDVAGSDKYQVNN KHD 
DKYSPLPCSKI IQRAEELVGQEVL YKLTS ENCEHFVNELRYGVA 
RSDQVRDVIIAASVAGMGLAAMSLIGVMFSRNKRQKQ 


6445 


2 




AG AAGAAGAARSPRPQAHTKGVRGI#PSRRRS PDCGRMELAAGS F 
SEEQFWEACAEI^PALAGADWQLLVBTSGISIYRLLDKKTGLY 
BYKVFGVLEDCSPTIiLADIYMDSDYRKQWDQYVKELYEQECNGE 
TWYWEVKYP FPMSNRD YVYLRQ RRDLDMEG RKI HVI LAR S TS M 
PQLGERSGVIRVKQYKQSLAIESDGKKGS KVFMYYFDNPGGQI p 
SWLINWAAKNGVPNFLKDMARACQNYLKKT 


6446 


1 ~| 


1651 


KUPTKHPPPDTPGSRGTTAMCJStiASGATGGRGAVENEEDLPECr- 
DSGDEAAWEDEDDADLPHGKQQTPCLFCNRLFTSAEBTFSHCKS 
EHQ FNI DSMVHKHGLE F YGY I KIi I N FIRLKNPTVEYMNS I YNP V 
PWEKEEYLKPVLEDDI.LLQFDVEDLYEPVSVPFSYPNGLSENTS 
^BKLKHMEARAIjSAEAALARAREDLQKM KQFAQDFV^IHTDVR^ 
CSSSTS VTADLQEDEDGVYFSSYGHYGIHEEMLKDKIRTESYRD 
FI YQNPH I FKDKWIiDVGCGTGl LSMFAAKAGAKKVLGVDQSEI 








fcYQAMDI IRLNKLEDTITL I KGKIEE VHLP VEKVDVi iSEWMGY 
FLLFESMLDSVLYAKKKYLAKGGSVYPDICTISLVAVSDVNKHA 
DRIAPWDDVYGPKMSCMKKAVIPEAVVEVIJDPKTLISEPCGIKH 
xuchi loXSDLEFSSDFTLKITRTSMCTAIAGYFDiyFEKNCHN 
RVVFSTGPQS TKTHWKQTVFLLEKPFS VKAGEALKGKVTVHKNK 
KDPRS IiTVTLTLNNSTQT YGLQ 


6447 


1554 


1068 


RXiGPASWHLSGPCHATLGAANRGRAUSVRAAWRGA^LCQRVMMP 
SRTT^TGIPSSKVKYSRLSSTDDGYIDLQFKKTPPKIPyKAIA 
IaATVLFL I GAFL I I1GSLLLSGYISKGGADRAVPVLIIGILVFL 
PG FYHLR I A YYAS KG YRG YS YDDI PDFDD 


6448 


74 


559 


GQVLSHCYHYRSSRWRRGGLSRGRGAGVMALVPYEETTEFGLQK 
FHICPIjATFS FANHTI QIRQDWRHLGVAAVVWDAA I VLS TYLEMG 
AVELRGRSAVEIX3AGTGLVG I VAALLACR I R YERDNNFIiAMLER 
QFIVRKVHYDPEKDVHIYEAQKRNQKEDL 


6449 
6450 


597 


1876 


EYGVCENLRKLEITGVSCRDVYAKLLHRYRriiLGtWQPDrGPYG 
GLLNVWDGLFI I GWM YL P PHD PHVDD PMRFKP LFRI HLMERKA 

ATVECMYGHKGPHHGHIQIVKKDEFSTKCNQTDHHRMSGGRQEE 
FRTWLREEWGRTLEDI FHEHMQBLI LMKF I YTSQYDNCLT YRRI 
YLPPSRPDDLIKPGLFKGTYGSHGLEIVMLSFHGRRARGTKITC 
DPNIPAGQQTVEIDLRHRIQLPDLENQRNFNELSRIVLBVRERV 
RQEQQEGGHEAGEGRGRQGPRESQPSPAQPRAEAPSKGPDGTPG 
EDGGEPGDAVAAAKQPAQCGQGQPFVLPVGVS SRNEDYPRTCRM 
C F YGTGL I AGHG FT S P E RTPGVF I LFDEDRFG FVWLELKS FSL Y 
SRVQATFRHADAPS PQAFDEMLKN I QSLTS 




84 B 


269 


FVPAPRTVSGKRS LPGE WE ERGEGEQRTGRE FSGNGGRAVEAAR 
MRLL CG LWLWLS LLKVLQAQTPTPLPLPPPMQS FQGNOFQG EWF 
VLGIiAGNSFRP EHRALLNAFTATFELS DDGRFEVWNAMTRGQHC 
DTWS YVLX PAAQPGQFTVDHRVWTHEQAGR PQDQPAGQELVAAS 
RDAGPVHLPGQSSGPLG 


6451 


232" 


qiQ i 


HbPTPPTSPRASTMEDVKl,EFPSLPQCKEDAEEWTYPMRREMQE~~ 
ILPGLFLGP YS S AMKS KL P VLQKHGI THI I CI RQNI EANFI Kl»N 
FQQL FRYLVLD I ADNP VENI IRFFPMTKEFIDGSLQMGGKVLVH 
GNAG I SRSAAFVI AYIMETFGMKYRDAFAYVQERRFCINPNAGF 
VHQLQEYEAIYLAKLTIQMMSPLQIERSLSVHSGTTGSLKRTHE 


6452 
"6453 


1 


652 


RTRGESSNMEPLAA YPLKCSG 1 PRAKVFAVLLS IVLCTVTLFLLQ" 
LKFLKPfCINSFYAFEVKDAJCGRTVSLEKYKGKVSLWNVASDCQ 
LTDRKYXGLKELHKEFGPSHFSVLAFPCNQFGE3EPRPSKEVES 
FARKNYGVTFPIFHKIKILGSEGEPAFRFLVDSSKKEPRWNFWK 
r ij vjm ^ WKFWKPEE P I EVI RPD IAALVRQ V 1 1 KKKEDL 




827 


223 


HRRWLPGLSMSPRRTLPRPLSLCLSLCLCLCLAAALGSAQSGSC 
RDKKNCKVVFSQQELRKRLTPLQYHVTQEKGTESAFEGEYTHHK 
DPGIYKCWCGTPLFKSETKFDSGSGWPSFHDVINSEAITFTDD 
FSYGMHRVETSCSQCGAHLGHIFDDGPRPTGKRYCINSAALSFT 
PADSSGTAEGGSGVASPAQADKAEL 




827 


223 


HRRWLPGLSMSPRRTLPRPLSLCLSLCLCLCLAAALGSAQSGSC 
RDKKNCKVVFSQQELRKRLTPLQYHVTQEKGTESAFEGEYTHHK 
DPGIYKCWCGTPLFKSETKFDSGSGWPSFHDVINSEAITFTDD 
FSYGMHRVETSCSQCGAHLGHIFDDGPRPTGKRYCINSAALSFT 
PADSSGTAEGGSGVASPAQADKAEL 


6455 


1042 


173 


KVH^rVSASAAWDALGLPVRSHMQGSTRRMGVMTDVHRRFLQL 
LMTHGVLEEWDVKRLQTHCYKVHDRNATVDKLEDFINNINSVLE 
SLYIEI KRG VTBDDGRP I YALVNLATTS IS KMATDEAENELDLF 
RKALELIIDSETGFASSTNILNLVDQLKGIGCMRKKEAEQVLQKF 
/0NKWLIEKEGEFTLHGRAILEMEQYIRETYPDAVK1 CNICHSL 








LIQGQSCETCGIRMHLPCVAKYFQSttAJsiPRCPHCNDYWPHEIPK 
VFDPEKERESGVLKSNKKSLRSRQH 


6456 


2 


"555 


TVEAEAALQNKWALYFAAARC&PSRDFTPLLCDFYTALV7AEAR 
RPAPFEVVFVSADGSSQEMLDFMRELHGAWLALPFHDPYRHELR 
KRYNVTAI PKLVTVKQNGEVITNKGRKQIRERGLACPQDWVBAA 
DIFQNFSV 


6457 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKIIHFPDFDKKIPV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET 
IILGKQYSLNIILSVFAIILGAFIAAGSDLAFNLEGYIFVFLND 
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMIIPTLIISVSTG 
DLQQATEFNQWKNVVFILQFLLSCFLGFLLMYSTVLCSYYNSAL 
TTAWGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY 
SFLTLSSQLKPKPVGEENICLDLKS 


6458 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKIIHFPDFDKKIPV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET 
IILGKQYSLNIILSVFAIILGAFIAAGSDLAFNLEGYIFVFLND 
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMIIPTLIISVSTG 
DLQQATEFNQWKNVVFILQFLLSCFLGFLLMYSTVLCSYYNSAL 
TTAWGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY 
SFLTLSSQLKPKPVGEENICLDLKS 


6459 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKIIHFPDFDKKIPV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET 
IILGKQYSLNIILSVFAIILGAFIAAGSDLAFNLEGYIFVFLND 
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMIIPTLIISVSTG 
DLQQATEFNQWKNVVFILQFLLSCFLGFLLMYSTVLCSYYNSAL 
TTAWGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY 
SFLTLSSQLKPKPVGEENICLDLKS 


6460 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKIIHFPDFDKKIPV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET 
IILGKQYSLNIILSVFAIILGAFIAAGSDLAFNLEGYIFVFLND 
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMIIPTLIISVSTG 
DLQQATEFNQWKNVVFILQFLLSCFLGFLLMYSTVLCSYYNSAL 
TTAWGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY 
SFLTLSSQLKPKPVGEENICLDLKS 


6461 


1653 


360 


LQQRTLRITAVGQTHPI AWMAWfe&SliGAFYGPAS FITFVNCMY F 
LS I FI QLKRHP ERKYELKEP TEEQQRLAANENG E INHQDSMSLS 
LISTSALENEHTFHSQLLGASLTLLLYVALWMFGALAVSLYYPL 
DLVFS FVFGATS LS FSAFFWHHCVNREDVRIAW IMT CCPGRS S 

GCKLTNLQAAAAQCHANSLPLNSTPQLDNSLTEHSMDNDI KMHV 
APLEVQFRTNVHS SRHHKNRS KGHRASRLTVLREYAYDVP TS VE 
GS VQNGLPKSRLGNNEGHSRS RRAYLAYRERQYNP PQQDSSDAC 
STLP KS S RKf FEKP VSTTS KKDALRKPAWELENQQKS YGLNLAI 
QNGP I KSNGQEGPLLGTDSTGNVRTGLWKHETT V 


6462 


3 


773 


SEELDREKKLKEDS PRKTPNKESGVPS LP VSLTS I KEEPKEAKH 
PDSQSMEESKLKNDDRKTPVNWKDSRGTRVAVSSPMSQHQSYIQ 
YLHAYPYPQMYDPSHPAYRAVS PVLMHS YPGAYLS PGFHYPVYG 
KMSGRE ETE KVNTS PS VNTKTTTESKALDLLQQHANQYRS KS PA 
PVEKATAEREREAERERDRHSPFGQRHLHTHHHTHVGMGYPL I P 
GQ YD P FQG LTSAALVASQQVAAQAS AS GM FPGQRRE 


6463 


2 


350 


VILCI LGGWI FKNADRSMEKKKGEPRTRAEARP WVDEDLKDS S D 
LHQAEBDADEWQESEENVEHIPFSHNHYPEKEMVKRSQEFYELL 
NKRRSVRFISNEQVPMEVIDNVIRTAGL 


6464 


12 


1154 


GILRQKEREERNRIHKKEILFLEHLLWPSEMSSLSGKVQTVLG 








IiVEPSKLGRTLTHBHLAMTFDCCyCPPPPCOBAlSKEPIVMKNL 
YWIQKNAYSHKBNLQLNQETEAIKEEIjLYFKANGGGALVENTTT 
G1SRDTQTLKRLABETGVHIISGAGFYVDATHSSBTRAMSVEQL 
TDVLMNB I LHG ADGTS I KOG 1 1 GE IGCS WPLTESERKVLQATAH 
AQAOIiGCPVIIHPGRSSRAPPQIIRILQBAGABISKTVMSHLDR 
TILDKKBLLEFAQLGCYLBYDL1TGTBLLHYQLGPDIDMPDDNKR 
IRRVRLLVEEGCEDRILVAHDIHTKTRLMKYGGHGYSHILTNW 
PKMLLRGITENVLDKILIENPKQWLTPK 


6465 


126 


1396 


iCMTVFFKTLRWHWKKTTAGLCLLTWGGHWLYGKHCDWLLRRAAd" 
QEAQVFGNQLIPPNAQVKKATVFLNPAACKGKARTLFEKNAAPI 
LHLSGMDVTIVKTDYEGQAKKLLELMENTDVI IVAGGDGTLQEV 
VTGVLRRTDEATFS KI P IG FI PLGETS SLSHTLFAESGNKVQH I 
TZ)ATLAI VKGETVPLD VLQ XKGEIOirQ P VFAMTGLRWGS FRDAG V 
KVSKYWYLEPLKIKAAHFFSTLKEWPQTHQASrSYTGPTERPPN 
EPEBTPVQRPSLYRRlIjRRIASYWAQPQDALSOEVSPEVWKDVQ 
LSTIELSI TTRNNQLDPTSKEDFLNI CIBPDT I SKGDF ITIGSR 
KVRNP KLKVEGTECLQASQCTLLI PEGAGGS FSIDSEB YRAMPV 
EVKLLPRKLQFFCDPRKREQMLTSPTQ 


6466 


1134 


828 


VARGTELSQLEKAHPPACMGRRKSKRKPPPKKKMTGTLETQFTC " 
PFCNHEKS CDVKMDRARirrGVlS CTVCLEE FQTPI TYLS E PVDV 
YSDW IDACEAANQ 


6467 


301 


2571 

r 


GELRVliALAHGELACHAVLTASLLSLRSRlWDSDMDYKRPN^T 
I KCWVGDNAVGKTRL I CARACKATLTQYQLliATHVPTVWAIDQ 
YRVCQEVLERSRDWDDVSVSLRLWDTFGDHHKDRRFAYGRSDV 
WLCFS I ANPNSLHHVKTMW YPEI KHFCPRAPVI LVGCQL DLR Y 
ADLEAVNRARRPLARP IKPNEILPPEKGREVAKELGI P YYETSV 
VAQFG IKDVFDNAI RAAL I SRRHLQFWKSHLRNVQRPLLQAP FX* 
PPKPPPPIIWPDPPSSSEECPAHLLEDPLCADVILVLQERVR1 
FAHKIYLSTSSSKFYDLFLMDLSBGELGGPSEPGGTHPEDHQGH 
SDQHHHHHHHHHGR DFLLRAAS FDVCES VDEAGG5 GPAGLRAST 
SDGILRGNGTGYLPGRGRVLSSWSRAFVSIQEEMAEDPLTYKSR 
LMWVKMDSS IQPGPFRAVLKYLYTGELDENBRDLMHIAH IAEL 
LBVFDLRMMVANI LNNE AFMNQEITKAFHVRRTNRVKECIiAKGT 
FSDVTFILDDGTISAHKPLLISSCDWMAAMFGGPFVESSTREW 
FPYTSKSCMRAVLSYLYTGMFTSSPDLDDMKLIILANRLCLPHL 
VALTEQYTVTGLMEATQ^VDrDGDVLVTLEIiAQFHCAYQliADW 
CLHHICTNYNNVCRKFPRDMKAMSPENQEYFEKHRWPPVWYLKE 
EDHYQRARKERE KEDYLHLKRQ PKRRWLFWNS PSS PS SS AAS S S 
SPSSSSAW 


6468 


3 


1374 


DAWAGTNMAALAPVGSPASRGPRIiAAGLRIJLJMLGLLOLIiAEPG " 

LGRVHHLALKBDVRHKVHLNTFX3FFKDGYMVVNVSSLSLNEPED 

KDVTIGFSLDRTKNDGFSSYLDEDVNYCILKKQSVSVTLL1LPI 

SRSEVRVXSPPEAGTQLPKI I FSRDBKVliGQSQBPNVNPASAGN 

QTQKTQDGGKSKRSTVDSKAMGEKSFSVHNNGGAVSFQFFFNIS 

TDDQEGLYSLYFHKCLGKBLPSDKFTFSUJIEITEKNPDSYLSA 

RFTPT.DTf TiVTCMfi ITPCTTT.CfiTTHTUTT Dim , DXTr\Trr«tf Tttnxf u* - 
uciiruriuii xiiortf r p r Ijov? 1 lWinllxKivKRNDVF K1HWLMAAL 

PFTKSLSLVFHAIDYHYISSQGFPIEGWAWYYITHLLKGALLF 

ITIALIGTGWAFIKHII^DKDKKIFMIVIPRRVLANVAYIIIES 

TEEGTTE YGLWKDSLFLVDLLCCGAILFPWWS IRHLQEASATD 

GKGKFSRAHFVLLSLL 


6469 


3 


1374 


D AWAGTNMAALA P VG S PAS RG P RLAAGLRLL PMLG IJjQL LA.EPG - " 
LGRVHHLALKDDVRHKVHLNTFGFFKIXSYMW^ 
KDVTIGFSLDRTKNDGFSS YLDBDVNYCILKKQSVS VTLLI LD I 
SRSEVRVKS PPEAGTQLPKI I FSRDEKVLGQSQEPNVKPASAGN 
QTQKTQDGGKSRRSTVDSKAMGEKSFSVHNNGGAVSFQFFFNIS 
TDDQEGLYSLYFHKCLGKEIiPSDKFTFSliDIEITEKNPDSYLSA 








GEIPLPKLYISMAFFFFLSGTIWIHILRKiU21©VFKIHWI<MAAL 
PPTKSLSLVFHAIDYHYISSQGFPIEGWAWYYITHLLKGALLF 
ITIALlGTGWAFIKHILSDKDKKIFMIVIPRRVLANVAYrilES 
TEEGTTE YGLWKDSL FLVDLLC CGAILFP WWS I RHLQE ASATD 
GKGKFSRAHFVLLSLL 


6470 


2726 1 


1437 


AAASGVS S RADAP VLAQ5 PAS AGNGRPSTP RVPGS RRH PS APRS 
G PL PREDGCRT PG PQLLPLPGALLR PRTLLS SAAETGRSRHPDT 
QHPSSGGRCRGGTESPSS^GRPASMAEAEEDCHSDTVRADDDE 
ENES PA BTDLQAQLQMFRAQWMFELAPGVSSSNLENR PCRAARG 
SLQKTSADTKGKQEQAKEEXARELFLKAVEEEQNGALYEAIKFY 
RRAMQLVPDIEFKITYTRSPDGDGVGNSYIEDNDDDSKMADLLS 
YFQQQLTFQESVLKLCQPBLESSQIHISVLPMEVLMYIFRWWS 
SDLDLRSLEQLSLVCRGFYICARDPEIWRLACLKVWGRSCIKLV 
PYTSWREMFLERPRVRFDGVYISKTTYIRQGEQSLDGFYRAWHQ 
VEYYRYIRFFPDGHVMMLTTPEEPQSIVPRLRTR 


6471 


1750 


203 


FFFDKMAAGGSGVGGKRSSKSDADSGFLGLRPTSVDPALRRRRR 
GPRNKKRGWRRLAQEPLGLEVDQFLEDVRLQERTSGGLLSEAPN
EKLFFVDTGSKEKGLTKKRTKVQKKSLLLKKPLRVDLILENTSK
VPAPKDVLAHQVPNAKKLRRKEQLWEKIJUCOGELPREVRRAQAR
LLNPSATRAKPGPQDTVERPFYDLWASDNPLDRPLVGQDEFFLE
O/rKKKGVKRPARLHTKPSQAPAVEVAPAGASYNPSFEDHQTLLS
AAHEVELQRQKEAEKLERQLALPATEQAATQESTFQELCEGLLE
ESDGEGEPGQGEGPEAGDAEVCPTPARLATTEKKTEQQRRREKA
VHRLRVQQAALRAARLRHQELFRLRGIKAQVALRLAELARRQRR
RQARREAEADKPRRLGRLKYQAPD1DVQLSSELTDSLRTLKPEG
NILRDRFKSFORRNMIBPRERAKFKRKYKVKLVEKRAFRBIQL


6472


3 


897 


SCGSDRAQWAMEFPFDVDALFPERITVLDQHLRPPARRPQTTfp 
ARVDLQQQIMTIIDBLGKASAKAQNL3APITSASRMQSNRHWY
IIJCDSSARPAGKGAIIGFIKVGYKKLFVLDriREAHNEVEPLCIL 
DFYIHESVQRHGHGRELFQYMLQKERVEPHQLAIDRPSQKLLKF 
LNKHYNLETTVPQVNNFVIFEGFFAHQHRPPAPSLRATRHSRAA 
AVDPTPAAPARKLPPKRAEGDIKPYSSSDREFLKVAVEPPWPLN 
RAPRRATPPAHPPPRSSSLGJTSPERGPLRPFVP 


6473


22 


912 


SSAVEFWEGKKMAAEPNKTEIQTLFKRLRAVPTNKACFDCGAK 
NPSWASITYGVFLCIDCSGVHRSLGVHLSFIRSTELD3NWNWFQ
LRCMQVGGNANATAFFRQHGCTANDANTKYNSRAAQMYREKIRQ
LGSAAlJUUiGTDLWIDNMSSAVPNHSPEKKDSDFFTEHTQPPAW
DAPATEPSGTQQPAPSTESSGLAQPBHGPNTDLLGTSPKASLEL
KSS11GKKKPAAAKKGLGAKKGLGAQKVSSQSFSEIERQAQVAE
KLREQQAADAKKQAEESMVASMRLAYQELQIDR


6474 


3 


462


LQRQRQHPAAAPAVPVRCFTFCFTDIVIMPKRKSPENTEGKDGS
KVTKQEPTRRSARLSAKPAPPKPEPKPRKTSAKKEPGAKISRGA 
KGKKEEKQEAGKEGTAPS8NGETKABEIHISRSTVNVSTSRGTP 
PSTLSVKGQIETVRVKGTBN 


6475 


3 


462 


LQRQRQHPAAAPAVPVRCFTFCFTDIVIMPKRKSPENTEGKDGS

KGKKEEKQEAGKEGTAPSENGETKAEEIHISRSTVNVSTSRGTP 
PSTLSVKGQIETVRVKGTBN 


6476


106 


1090 


ARAMAQYKGTMREAGRAMHLIiKKRERQREQMEVLKQRIAEETIL
KSQVDKRFSAHYDAVEABLKSSTVGLVTIiNDMKARQEALVRERE
RQLAKRQHLEEQRIjQQERQREQEQRRERKRKISCLSFAIiDDLDD

QADAAEARRAGNIX5ICNPDVDTSFI*PDRDREEEENRLREELRQEW
EAQREKVKDEEMEVTFSYWDGSGHRRTVRVRKGNTVQQFLKKAL

QGLRKDFLELRSAGVEQLMFIKEDLILPHYHTFYDFIIARARGK

SGPLFSFDVHDDVRLLSDATMEKDESHAGKVVLRSWYEKNKHIF
PASRWEAYDPEKKWDKYTIR


6477 


227 


915 


LQGHI^GIMAASRPLSRFWEWGKWIVCVGRNYADHVRBMRSAVL 
SEPVLFLKPSTAYAPEGSPILMPAYTRNLHHBLELGWMGKRCR 
AVPEAAAMDYVGGYALCIiDWTAimVQDEaCKKGLPWTLAKSFTA
SCPVSAFVPKEKIPDPHKLKLWLKVNGELRQEGETSSMIPSIPY
IISYVSKIITLEEGDIILTGTPKGVGPVKENDEIEAGIHGLVSM
TFKVEKPBY 


6478 


2 


1495 


FVSSRILPESLASSEASTLEAMGRKEEDDCSSWKKQTTNIRKTF 
IPMEVLGSGAFSEVFLVKQRLTGKLFALKCIKKSPAFRDSSLEN
BIAVLKKIKHENIVTLEDIYESTTHYYLVMQLVSGGELFDRILE
RGVYTEKDASLVIQQVIiSAVKYLHENGIVHRDLKPENLLYLTPE
ENSKIMITDFGLSKMEQNGIMSTACGTPGYVAPBVLAQKPYSKA
VDCWSIGVITYILLCXJYPPFYEETESKLFEKIKEGYYEFBSPFW
DDISESAKDFICHLLEKDPNBRYTCEKALSHPWIDGNTAI.HRDI
YPSVSLQIQKNFAKSKWRO^FNAAAVVHHMRKLHMNLHSPGVRP
EVENRPPETQASETSRPSSPEITITEAPVLDHSVALPALTQLPC 
QHGRRPTAPGGRSLNCLVNGSLHISSSLVPMHQGSIAAGPCGCC 
SSCLNIGSKGKSSYCSEPTLLFCKANXKQNFKSEVMVPVKASGSS 
HCRAGQTGVCLIM 


6479


3


949 


SCRGPGWHPAGGQAGAMELLSALSLGELALSFSRVPLPPVFDLS
YFIVSILYLKYEPGAVELSRRIIPIASWLCAMLHCFGSYILADLIi
LGEPLIDYFSNNSSILLASAVWYLIPFCPIiDLFYKCVCFLPVKI,
IPVAMKEWRVRKIAVGIHHAHHHYHHGWFVMIATGWVKGSGVA
LMSNFEQLLRGVWKPETNEILHMSFPTKASLYGAILFTLQQTRW
LPVSKASLIFIFTLFMVSCKVFLTATHSHSSPFDAUSGYICPVL
FGSACGGDHHHDNHGGSHSGGGPGAQHSAMPAKSKEELSEGSRK
KKAKKAD 


6480




514 


DFMSIYFPIHCPDYLRSAKMTEVMMNTQPMEEIGLSPRKDGLSY
QIFPDPSDFDRCCKLKDRLPSIWEPTEGEVESGELRWPPEEFL 
VQEDEQDNCEETAXENKEQ 


6481 


110 


1131 


KSRMDLDWNMFVIAGGTLAIPILAFVASFLLWPSALIRtYYWY^
WRRTLGMQVRYVHHEDYQPCYSFRGRPGHKPSILMIiHGFSAHKD
MWLSWKFLPKNLHLVCVDMPGHEGTTRSSLDDLSIDGQVKRIH
QFVECLKIiNKKPFHLVGTSMGGQVAGVYAAYYPSDVSSLWLVCp
AGtOYSTDNQFVQRLKELQGSAAVEKIPLIPSTPEEMSEMIiOLC
SYVRFKVPQQILQGLVDVRIPHNNFYRKLFIjEIVSEKSRYSLHQ
NMDKIKVPTQ11WGKQDQVLDVSGADMLAKSIANCQVELLENCG
HSWMERPRKTAKLIIDFLASVHNTDNNKKLD


6482 


2517 


568 


EPVSKVSQSRRKAGVPTANI^ESOAVEAAMANVPWAEVCEKFQA
ALALSRVELHKNPEKEPYKSKYSARALLEEVKALLGPAPEDEDE
RPEAEDGPGAGDHALGLPAEWEPEGPVAQRAVRLAVIEFHLGV
NHIOTEEI^AGEEHLVKCLRLLRRYRLSHDCISLCIQAQNNLGI
LWSEREEIETAQAYLESSEALYNQYMKEVGSPPLDPTERFLPBE
EKLTEQBRSKKFEKVYTHNLYYLAQVYQHLBMFEKAAHYCHSTL
KRQLBWNAYHPIEWAINAATLSQFYINKLCFMEARHCLSAANVI
FGQTGKISATEDTPEAEGEVPELYHQRKGEIARCWIKYCLTLMQ
NAQLSMQDNIGELDLDKQSELRALRKKELDEEESIRKKAVQFGT

GELCDAISAVEEKVSYLRPLDFEEARBLFLLGQHYVFEAKEFFQ
IDGYVTDHIEWQDHSALFKGIiAFFETDMERRCKMHKRRIAMLE
PLTVDLNPQYYLLVIHIQIQFEIAHAYYDMMDIjKVAIADRLRDPD
SHIVKKINNLNKSALKYYQLFLDSLRDPNKVFPEHIGEDVLRPA
MLAKPRVARLYGKIITADPKKEIiENIATSLEHYKFIVDYCEKHP
EAAQEIEVELBLSKEMVSLLPTKMERFRTKMALT 


6483 


3 


623 


NSHLLCGLRARAPLSANGREARAMEQRLAEFRAARKRAGIJOiQP 
PAASO^AQTPGEKAEAAATLKAAPGWLKRFLVWKPRPASARAQP 
GLVQEAAQPO/jSTSETPWNTAIPLPSCWDQSFLTNITFLXVIiLW 
LVLLGLFVELEFGIAYFVLSLFYWMYVGTRGPBEKKE3EKSAYS 
VFNPGCEAIQGTLTAEQLERELQLRPLAGR 


6484


201 


965 


QLAVKTKMSGLRPGTQVDPEIELFVKAGSDGBSIG^CPFCQRLF 
MILWLKGVKFNVTTVDMTRKPBELKDLAPGTNPPFIjVYNKELKT 
DFIKIEEFLEQTLAPPRYPHLSPKYKESFDVGCNLFAKFSAYIK
NTQKEANKNFBKSLLKEPKRLDDYLNTPLLDEIDPDSAEBPPVS
RRLFLDGDQLTLADCSLLPKLNIIKVAAKKYRDFDIPAEFSGVW
RYLHNAYAREEFTHTCPEDKEIENTYANVAKQKS


6485


6 


1091 


FVDLVRAVEFLPCPDSQKLEKECQSSEESMGSNSMRSILEEDEE
DEEPPRVLLYHEPRSFEVGMLVWHKHKKYPFWPAWKSVRQRDK
KASVLYIEGHMNPKMKGFWSLKSLKHFDCKEKQTLLNQAREDF
NQDIGWCVSLITDYRVRLGOGSFAGSFLBYYAADISYPVRKSIQ
QJDVLGTKLPQLSKGSPEEPWGCPLGQRQPCRKMLPDRSRAARD
RANQKLVEYIGKAKGAESHIiRAlIiKSRKPSRWLQTFLSSSQYVT
CVETYLEDEGQLDLVVKYLQGVYQBVGAKVLQRTNGDRIRFILD
VLLPEAIICAISAGDEVDYKTABEKYIKGPSLSYREKEIFDNQL
LEERNRRRR 


6486 


10 


581 


LVLQAGGAHLSPSRVTQXil^VMlAFSEMPKPPDYSELSDSIiTLA 
GGTGRFSGPLHRAWRMMNFRQRMGWIGVGLYLLASAAAFYYVFE
ISETYNRLALEHIGXJHPEEPIiEGTTWTHSLKAQLIiSLPFWVWTV
IFLVPYLQMFLFLYSCTRADPKTVGYC11PICLAVICNRHQAFV
KASNQISRLOLIDT 


6487


352 


863 


SFLKPLRGKMSVTLHTDVGDIKIEVFCERTPKTCENFIALtjASW
YYNGCIFHRNIKGFMVQTGDPTGTGRGGNS3WGKKFEDEYSEYL
KHNVRGVVSMANNGPNTNGSQFFITYGKQPHLDMKYTVFGKVID
GLETLDELEKLPVNEKTYRPLNDVHIKDITIHANPFAQ


6488 


878 


241 


TALQEFGTSGPPLSLRFALPSQTGRFKPIiFQARGPSWPPSPRVP
MEPPNLYPVKLYVYDLSKGLARRLSPIMLGKQLEGIWHT5IVVH
JtoEFFFGSGGISSCPPGGTLU3PPDSWDVGSTEVTEEiFLEYIi
SSLGESLFRGEAYNLFBHNCNTFSNEVAQFIiTGRKIPSYITDLP
SEVLSTPFGQALRPLLDSIQIQPPGGSSVGRPNGQS


6489


1457 


375 


KVAKMATALSEEELDNEDYYSIJjNVRREASSEEIiKAAYRRIjCML 
YHPDKHRDPELKSQAERLFNLVHQAYEVLSDPQTRAIYDIYGKR 
GLEMEGWEWERRRTPAEIREEFERLQREREERRLQQRTNPKGT
ISVGVDATDLFDRYDEEYEDVSGSSFPQIEINKMHISQSIEAPL
TATDTAILSGSLSTQNGNGGGSINFALRRVTSAKGWGELEFGAG
DLQGPLFGLKLFRNLTPRCFVT1NCALQFSSRGIRPGLTTVLAR
NLDKNTVGYLQVmCSSPLLQVQRPHRNTRACAPEPSFRPFLHVP 
TWDABCSGARTPSTAWTSAAVKIiREACLSGPGSGSHQLLLLTPR 
SKRRTGGG 


6490 


3 


1183 


HEAGCEVWLGYGPRAAAAAAATVLFGGAQPTETMFVARSIAADH
KDLIHDVSFDFHGRRMATCSSDQSVKVWDKSESGDWHCTASWKT
HSGSVWRVTWAHPEFGQVLASCSFDRTAAVWEEIVGESNDKLRG
QSHWVKRTTLVDSRTSVTDVKFAPKHMGLMbATCSADGIVRIYS
APDVMNLSQWSLQHEISCKLSCSCISWNPSSSRAHSPMIAVGSD
DSSPNAMAKVQIFEYNENTRKYAKABTLMTVTDPVHDIAFAPNL
GRSFHILAIATKDVRIFTLKPVRKELTSSGGPTKFEIHXVAQFD
NHNSQVWRVSWNITGTVLASSGDDGCVRLWKANYMDNWKCTGIL
KGNGSPVNGSSQQGTSNPSLGSNIPSLQMSLNGSSAGRKHS 


6491 


3 


1183 


HEAGCEVWLGYGPRAAAAAAATVLFGGAGPTETMFVARSIAADH 
KDLIHDVSFDFHGRRMATCSSDQSVKVWDKSESGDWHCTASWKT 
HSGSWJRVTWAHPEFGQVLASCSFDRTAAVWEEIVGESNDKLRG 
QSHWVKRTTLVDSRTSVTDVKFAPKKMGLWIjATCSADGIVRIYE
APDVMNLSQWSLQHEISCKLSCSCISWNPSSSRAHSPMIAVGSD
DSSPNAMAKVQIFEYNENTRKYAKAETLMTVTDPVKDIAFAPNL
GRSFIIILAIATKDVRIFTLKPVRKELTSSGGPTKFEIHIVAQFD
NHNSQVWRVSWNITGTVLASSGDDGCVRLWKANYMDNWKCTGIL


6492 


24 


2573 


KGNGSPVNGSSQQGTSNPSIiGSNIPSLQNSLNGSSAGRKHS 

IPFLKSCCCCCLFDFPPPPLDdVQEEECEVBRVTEHGTPKPFRk 
PDSVAFGESQSEDEQFENDLETDPPNWQQLVSREVLLGLKPCBI 
KRQEVINELFYTERAHVRTLKVLDQVFYQRVSREGILSPSELRK
IFSNLEDILQLHIGLNEQMKAVRKRNBTSVIDQIGEDLLTWFSG 
PGEEKLKHAAATFCSNQPFALEMIKSRQKKDSRFQTFVQDAESN 
PLCRRLQLKDIIPTQMQRIjTKyPLLLDNIATYTEWPTBREKVKK 
AADHCRQILNYVNQAVKEAENKQRLEDYQRRLDTSSLKLSBYPN 
VEBLRNIiDLTKRKMIHEGPLVWKVNRDKTIDLYTLLLBDILVLL 

KALFVISMSDNGAQIYELVAQTVSEKTVWQDLICRMAASVKEQS
TKPIPLPQSTPGBGDNDEEDPSKLKEEQHGISVTGLQSPDRDLG
LESTLISSKPQSHSLSTSGKSEVRDLFVAERQFAKEQHTDGTLK
EVGEDYQIAIPDSHLPVSEERWALDALRNLGLLKQLIiVQQLGLT

EKSVOEl)WOHPPRYRTA^OGPnTnQVTftTJQPTNTTTriiViJor«rv'tiMr» 

FRTGTGDIATCYSPRTSTESFAPRDSVGLAPQDSQASNILVMDH 
MIMTPEMPTMEPEGGLDDSGEHFFDAREAHSDBNPSEGDGAVKK 
EEKDVNLRISGNYLILDGYDPVQESSTDEEVASSLTLQPMTGIP
AVBSTHQQQHSPQNTHSDGAISPFTPEFLVQQRWGAMEYSCFEI
QSPSSCADSQSQIMEYIHKIBADLEHLKKVEESYTIIiCQRLAGS 
ALTDKHSDKS 


6493 


557 


1147 


TPARMAYQGSSTSDCMSKTIiDSASAHFAASAVVSAPVPSRSEVA
KEQNTGHNNINGWQPSGTSKTLYSTNMALSSSPGISAVQLVRT
VGHTTTNHLIPALCTSSPQTLPMNNSCLTNAVHLNNVSWSPVN
VHINTRTSAPSPTALKLATVAASMDRVPKVTPSSAISSIARENH
EPERI*GI^GIAETTVAMEVT 


6494 


2425 


1052 


AVAGGARPCSTPSSPHRRCRRHRPRPLPRPPAAIMSASAVYVLD 
LKGKVLICRNYRGDVDMSEVEHFMPIIiMEXEEBGMLSPILAHGG 
VRFMWIKHNNLYLVATSKKNACVSLVFSFLYKWQVFSEYFKE'L
BEESIRDNFVIIYELLDELMDFGYPQTTDSKILQBYITQEGHKI,
BTGAPRPPATVTNAVSWRSEGIKYRKNEVFLDVIBSVNLEVSAN
GNVLRSEIVGSIKMRVFLSGMPELRLGLNDKVLFDNTGRGKSKS
VELEDVKFHQCVRLSRFENDRTISFIPPDGEPELMSYRliNTHVK
PLIWIESVTEXHSHSRIEYMIKAKSQFKRRSTANNVEIHIPVPK
DADSPKFKTTVGSVKNVPENSEIVWSIKSFPGGKEYLMRAHFGL
PSVEAEDKEGKPPI5VKFEIPYFTTSGIQVRYI*KIIEXSGYQAL
PWVRYITQNGDYQLRTQ 


6495 


2425 


1052


AVAGGARPCSTPSSPHRRCRRHRPRPLPRPPAAlttSASAVYVLD
LKGKVLICRNYRGDVDMSEVEHFMPILMEKEEEGMLSpILAHGG
VRFMWIXHNNLYLVATSKKNACVSLVPSFLYJCWQVFSEYFKBL
EEESIRDNFVIIYELLDEIlflDFGYPQTTDSKILQEYITQEGHKL
ETGAPRPPATVTNAVSWRSEGIKYRKNEVFLDVIESVNLLVSAN
GNVLRSEIVGSIK^VFLSGMPEIjRijGIiNDKVLFDNTGRGKSKS
VELEDVKFHQCVRLSRFENDRTISFIPPDGEFEI44SYRLNTHVK
PLIWIESVIEKHSHSRIEYMIKAXSQFKRRSTANNVEIHIPVPN
DADSPKFKTTVGSVKWVPENSBIVWSIKSFPGGKEYLMRAHFGL
PSVEAEDKEGKPPISVJKFEIPYFTTSGIQVRYIjKIIBKSGYQAL
PWVRYITQNGDYQLRTQ 


6496 


247 


559 


LRAVSLLPLQLVLPEYSIHSLFCIMFLCAQEWLTLGLNVPULfv

HFWRYFHCPADSSELAYDPPVVMNADTLSYCQKEAWCKliAPTLL 
SFFYYLYCMIYTLVSS 


6497 


1053 


352 


ANTQICRLCPRRHLHPPCGAKMGNGTEEDYNFVFKVVLIGESGV 
GKTNLLSRFTRNEFSHDSRTTIGVEFSTRTVMLGTAAVKAQIWD
TAGLERYRAITSAYYRGAVGALLVFDLTKHQTYAWERVJLKELY
DHAEAT1WMLVGNKSDLSQAREVPTEBARMFAENNGLLFLETS
ALDSTNVELAFETVLKEIFAKVSKQRQNSIRTNAITLGSAQAGQ 








EPGPGEKRACCISL 


6498 


2636 


272 


SLRLCPMGTHIiAGPTTMRLSSLLALLRPALPLILGLSLGCStst 
LRVSWIQGBGEDPCVEAVGERGGPQNPDSRARLDQSDEDFKPRI 
VPYYRDPNKPYKKVLRTRYIQTELGSRERLLVAVLTSRATLSTL
AVAVNRTVAHHFPRLLYPTGQRGARAPAGMQVVSHGDERPAWIiM
SETLRHLHTHFX3ADYDWPPIMQDDTYVQAPRIiAAIiAGHL9INQD
LYLGRAEEFIGAGEQARYCKGGFGYLLSRSLLLRLRPHLDGCRG
DILSARPDEWLGRCLIDSLGVGCVSQHQGQQYRSFELAKNRDPE
KEGSSAFLSAFAVHPVSEGTLMYRLHKRFSALELBRAYSEIEQL
QAQIRNI,TVLTPEGEAGLSWPVGLPAPFTFHSRFEVLGWDYFTB
OHTFSCADGAPKCPLQGASRADVGDALETALEQLNRRYQPRLRF
QKQRLLNGYRRFDPARGMEYTLDLLLECVTQRGHRRALARRVSL
LRPLSRV£1LPMPYVTEATRVQLVLPLLVAEAAAAPAFLEAFAA
NVLEPREHALLTLLLVYGPREGGRGAPDPFJU3VKAAAAELERRY
PGTRIAWLAVRAEAPSQVRXnMDWSJOCHPVDTLPPLTTVWrRPG
PEVLNRCRMNAISGWQAFFPVHFQBFNPALSPQRSPPGPPGAGP
DPPSPPGADPSRGAPIGGRFDRQASAEGCFYNADYLAARARLAG
ELAGQEEEEALEGLEVMDVFLRPSGLHLFRAVEPGLVQKFSLRD
CS PRLS^LYHROU^NLEGLGGRAQLAMALFEQEQANST 


6499 


3 


2040 


SCSADTRPSGQAWPTVGLRAAAGAFRTGSPLALGPETPQVACLP
GHPPVRPQVSGGPGAMPDPAAHLPFFYGSISRAEAEEHIiKliAGM
ADGLFLLRQCLRSLGGYVLSLVHDVRFHHFPIERQLNGTYAIAG
GKAHCGPAELCEFYSRDPDGLPCNLRKPCNRPSGLEPOPGVFDC
LRDAMVRDYVRQTWKLEGEAIiEQAIISQAPQVEKLIATTAHERM
PWYHSSLTREEAERKLYSGAQTDGKFbLRPRKEQGTYALSLIYG
KTVYHYLISQDKAGKYCIPEGTKFDTLWQLVEYLKLKADGLIYC
LKEACPNSSASNASGAAAPTLPAHPSTLTHPQRRIDTLNSDGYT 
PEPARITSPDKPRPMPMDTSVYESPYSDPEELKDKKLFLKRDNL 
LIADIELGOGNFGSVRQGVYRMRKKQIDVAIKVLKQGTEKADTE
EMMREAQIMHQUSNPYIVRLIGVC^AEAIaMLVMEMAGGGPb
LVGKREEIPVSNVAELLHQVSMGMKYLEEKNFVHRDLAARNVLL
VNRHYAKISDFGLSKALGADDSYYTARSAGKJ^LKWYAPECINF 
RKPSSRSDVWSYGVTMWBAIiSYGQKPYKKMKGPEVMAFIBQGKR 
MECPPECPPELYALMSDCWIYKWEDRPDFLTVEQRMRACYYSLA 
SKVEGPPGSTQKAEAACA


6500 


iT73 


72* 


TGPTHASADAWGLVRSVTEWCANVRGNPGAAALSCPQAVLDAGK
MLSESSSFUCGVMLGSIFCALITMLGHIRIGHGNRMHHHEHHHL
QAPNKEDILKISEDERMELSKSFRVYCI1LVKPKDVSLWAAVKE
TWTKHCDKAEFFSSENVKVFESINMDTNDMMLMMRKAYKYAFDK
YRDQYWWFFLARPTTFAIIENLKYFLLKKDPSQPFYLGHTIKSG 
DLEYVGMEGGIVIjSV3SMKRLNSLLNIPEKCPEQGGMIWKISED 
KQLAVCLKYAGVFAENAEDADGKDVFNTKSVGLSIKEAMTYHPN
QVVEGCCSDMAVTFNGLTPNQMHVMMYGVYRLRAFGPYFQ 


6501


1 


570 


LVGMSGGGTETPVGCEAAPGGGSKKRDSLGTAGSAHL11KDLGE
IHSRLLDHRPVIQGETRYFVKEFEEKRGLREMRVLENLKNMXHE
TNEHTLPKCRDTMRDSLSQVLQRLQAANDSVCRLQQREQERKKI
HSDHLVASEKQHMLQWDNFMKEQPNKRAEVDEEHRKAMBRLKEQ
YAEMEKDLAKFSTF


6502 


213 


1650


AGNKPDPWAGRNRTAVLPDVSVFHREDVGWWRSWLQQSYQAVKE
KSSEALEFMKRDIiTEFTQWQHDTACTIAATASWKEKLATEGS
SGATEKMKKGLSDFLGVISDTPAPSPDKTIDCDVITLMGTPSGT 
AEPYDGTKARIiYSLQSDPATYCNEPDGPPELFDAWLSQFCIiEEK 
KGBISELLVGSPSIRAIiYTKMVPAAVSHSEFWHRYFYKVHQLEQ
EQARRDALKQRAEQSISEEPGWEEEEEELMGISPISPKEAKVPV 
AXISTFPEGEPGPQSPCEENLVTSVEPPAEVTPSESSESISIiVT 
QIANPATAPEARVLPKDLSQKLLEASLEEQGLAVDVGETGPSPP 








IHSKPLTPAGHTGGPEPRPPARVETLRBEAPTDLRVFELNSDSG 
KSTPSNNGKKGSSTDISEDWEKDFDLDMTEEEVQKALSKVDASG
EVSGPGGSEGSEPNGPGCESSPQPAQLSPQEGPCSCLR 


6503 


213 


1650 


AGNKPDPWAGRNRTAVLPDVSVFHREDVGVWRSWLQQSYQAVKE 
KSSSALEFMKRDLTEPTQWQHDTACTIAATASWKEKLATEGS
SGATEKMKKGLSDPIiGVISDTFAPSPDKTIDCDVITLMGTPSGT
AEPYDGTKARLYSLQSDPATYCNEPDGPPELFDAWLSQFCLEEK
KGBISELLVGSPSIRALYTKMVPAAVSHSEFWHRYFYKVHQLEQ
EQARRDALKQRAEOSISEEPGWEEEEEKLMGISPISPKEAKVPV
AKISTFFEGEPGPQSPCEKNLVTSVEPPABVTPSESSESISLVT
QIANPATAPEARVLPKDLSQKLLEASLEEQGLAVDVGETGPSPP
IHSKPLTPAGHTGGPEPRPPARVETLREBAPTDLRVFELNSDSG
KSTPSNNGKKGSSTDISEDWEKDFDLDMTBEEVQMALSKVDASG
EVSGPGGSEGSEPNGPGCESSPQPAQLSPQEGPCSCLR


6504


2131 


1294 


GKVCLVAHWVCLSILSPPPAGMKTPNAQEAEGQQTRAAAGRATG
SANMTKKKVSQKKQRGRPSSQPCRNIVGCRISHGWKEGDEPITQ
WKGTVLDQVPINPSLYLVKYDGIDCVYGLELHRDERVLSLKILS
DRVASSHISDANLANTIIGKAVEHMFEGEHGSKDEWRGMVLAQA
PIMKAWFYITYEKDPVLYMYQLLDDYKEGDLRIMPESSESPPTE
REPGGWDGLIGKHVEYTKEDGSKRIGMVIHQVEAKPSVYFIKF
DDDFHIYVYDLVKKS


6505 


2131


1294 


GKVCLVAHWVCLSILSPPPAGMKTPNAQEAEGQQTRAAAGRATG
SANMTKKKVSQKXQRGRPSSOPCRNIVGCRISHGWKEGDEPITQ
WKGTVLDQVPINPSLYLVKYDG1DCVYGLELHRDERVLSLKILS
DRVASSHISDANLANTIIGKAVEHMFEGEHGSKDEWRGMVLAQA
PIMKAWFYITYEKDPVLYMYQLLDDYKEGDLRIMPESSBSPPTE
REPGGVVDGLIGKHVEYTKEDGSKRIGMVIHQVEAKPSVYFIKF
DDDFHIYVYDLVKKS


6506


1 


1350 


EVSPPTSCCLTVAVADPGVSEGFRGFGAGCEMPGRGRCPDCGST
ELVEDSHYSQSQLVCSDOGCWTEGVLTTTFSDEGNLREVTYSR 
3TGENEQVSRSQQRGLRRVRDLCRVLQLPPTFEDTAVAYYQQAY 
RHSGIRAARLQKKEVLVGCCVLITCRQHNWPLTMGAICTLLYAD 
LDVPSSTYMQIVKLLGLDVPSLCLAELVKTYCSSFKLFQASPSV
PAKYVEDKEKMLSRTMQLVELANETWLVTGRHPLPVITAATFIA
WQSLQPADRLSCSLARFCKLANVDLPYPASSRLQELLAVLLRMA
EQLAWLRVLRLDKRSVVKHIGDLLQHRQSLVRSAFRDGTABVET
REKEPPGWGQGOGEGEVGNNSLGLPQGKRPASPALLLPPCMLKS 
PKRICPVPPVSTVTGDENISDSEIEQYLRTPQEVRDFQRAQAAR 
QAATSVPNPP 


6507


1878 


929 


RSHASRLPELPSGCLVLQVQELVQMSGMEATVTIPIWQNKPHGA
ARSVVRRIGTNLPLKPCARASFETLPNISDLCLRDVPPVPTLAD
IAWIAADEEETYARVRSDTRPLRHTWKPSPLIVMQRNASVPNLR
GSEERLLALKKPALPALSRTTELQDELSHLRSQIAKIVAADAAS 
ASLrPDFLSPGSSNVSSPLPCFGSSFHSTTSFVISDITEETEVE 
VPELPSVPLLCSASPECCKPRHicAar*? Q^RTrnnrvQT.Q ir»c o rn 

DMMGILKDFHRMKQSQDLNRSLLKEEDPAVLISEVLRRXFALKE 
EDISRKGN 


6508 


862 


342 


WEARKRPQRWPSERREVRVPPPHLQRGRSGLEPGTPRKMAAARP 
SLGRVLPGSSVLFLCDMQEKFRHNIAYFPQIVSVAARMLKNTTL 
DLLDRGLQVHVWDACSSRSQVDRLVALARMRQSGAFLSTSEGL
ILQLVGDAVHPQFKEIQKLIKEPAPDSGLLGLFQGQNSLLH1


6509 


2 


1053 


FVIVNPRGGRKRRRQAAVTQAATRASGTPSPRDGTMTOGKLSVAN
KAPGTEGQQQVHGEKKEAPAVPSAPPSYBEATSGEGMKAGAPPP
APTAVPLHPSWAYVDPSSSSSYDNGFPTGDKELFTTFSWDDQKV
RRVFVRKVYTILLIQLLVTLAWALFTFCDPVKDYVQANPGWYW
ASYAWFATYLTLACCSGPRJRHFPWNLILLTVFTLSMAYLTGML








S3YYNTTSVLLCliGITALVCI^VTVF^FfyrygnpT»QfvvaTrr — 
inu»u«\( i v b o r \ft ftr uc a oCQQVLFVLi 

LMTLFFS GL I IiA I LL PFQYVP WLHAVYAALGAGVFTLPLAIiDTQ 

LLMGNRRHSLSPEEYlFGAIJ^IYXiDIiytprPPLQLFGlNRE 


6510 
6511


37 


1156 


PCALDGCPQRGAVHPliLSSAMGLLAPLKTQFVLHLLVGFVFVVS
GLVINFVQLCTLALWPVSKQLYRRLNCRLAySLWSQLVMLLEWW
SCTBCTLFTDQATVERFGKBHAVIILNHNPEIDFLCGWTMCERF
GVXGSSKVLAKKBLLYVPLIGWTWYFLEXVFCKRKWEEDRDTVV
EGLRRLSDYPEYMWFliLyCEGTRFTETKHRVSMEVAAAKGLPVL 
KYHLLPRTKGFTTAVKCLRGTVAAVYDVTLHFRGNKNPSLLG1L 

YGKKYEADMf!VT?RT?D7 .PHTTDT nff V?H hnur irvr vnn vni t — 

luvnnr luiUU x ir JjjJgJUSw^jWLIlKIixQBKDA t -OK T Y 

NQKGMFPGEQFKPARRPWTLLNFLSWATILLSPLFSFVLGVFAS 
GSPLLILTFLGFVGAGNGHCR


6512 


2541 


1425 


GEEQPIAAAPTECLEQVIGGAGDPGTWASFPSPLPGPAPIiKGGK 
TMATNFSDIVKQGYVKMKSRKLGIYRRCWLVFRKSSSKGPQRLB 
n.x ruc**J3 v ^^K^U^KVTEISNVKCVTRI»PKETKRQAVAI I FTDD 
!SARTFTCDSELEAEEWYKTLSVECLGSRLNDISLGEPDLLAPGV
QCEQTDRFNVFLLPCPNLDVYGECXLQITHENIYLWDIHNPRVK
LVSWPLCSLRRYGRDATRFTFBAGRMCDAGEGLYTFQTQEGEQI
YQRVHSATIiAIAEQKKRVLLEMEKNVRLIiNKGTEHYSYPCTPTT
MLPRSAYWHHITGSQNIAEASSYAGEGYGAAQASSETDLLNRFI
LLKPKPSQGDSSEAKTPSQ


6513


159 


807 


FGKKSTWFPl»SRSIiRVASGRSCKLGHGGYTGSGPGFGEPRDSGA 
EVPSGSGRATGCERGGVRGARQGRAPGSSIWRKBPRMVCTRKTK 
TLVSTCVILSGMTNIICLLYVGWVTNY1ASVYVRGQEPAPDKKL
EEDKGDTLKIIERIjDHLENVIKQHIQBAPAKPEEAEAEPFTDSS
J, «" ,J w»"«^ii«<<itv/ujxv,w'V* *«riNAYI*SDRLPLDRP 




2 


756 


FVSPEPGFSIiAOLNLIWQLTDTKQLVHSFAEGQDQX3SAYANRTA
LFPDLIiAQGNASIiRLQRVRVADEGSFTCFVSIRDFGSAAVSLQV
AAPYSKPSMTLEPNKDIjRPGDTVTITCSSYQGYPEAEVFWQDGQ
GVPLTGNVTTSQMANEQGLFDVHSILRWLGANGTYSCLVRNPV
*LCQDAHSSVTITPQRSPTGAVEVQVPEDPWALVGTDATLRCSF

arsrurouHVUHijiwyL iUI KQjjVHSFAEGQDQGSAYANRTALF 
PDLLAQGNT^SLRLQRVRVADEGS FTCFVSIRDFGSAAVSLQVAA 
PYSKPSMTXEPNKDLRPGDTVTITC9SYQGYPEAEVFWQDGQGV
PLTGNVTTSQMANEQGLFDVHS11>RWLGANGTYSCLVRNPVLQ
QDAHSSVTITPQRSPTGAVEVQVPEDPWALVGTDATLRCSFSP 
EPGFSLAQLNLIWQLTDTRQLVHSFTEGR 


6514 


985 


302 


VGIPGPTISSAAEMBDLLDI^EEIJIYSIATSRAKMGRRAQOSSA 
QAENHLNGKNSSLTLTGETSSAKLPRCRQGGWAGDSVKASKFRR
KASEEIEDFRI*RPOSLNGSDYt5GDTPTTDnT tttjotvpcimjt/t rwr*. 
APPSIQIKRVMTYRDLDNDLMKYSAIQTLDGEIDLKTiLTKVLAP
EHEVRERNPSWQDDVGWDWDHLFTEVSSEVLTEWDPLQTEKEDP 
AGQARHT 


6515 


1345 


305 


GRVGSRRRGAAVPGGCGAGSTQLEVSASASC3GALGSAi)MNP1W
VHGGGAGPISKDRKERVHQGMVRAATVGYGILREGGSAVDAVEG
AWALEDDPEFNAGCGSVLNTNGEVEMDASIMDGKDLSAGAVSA
VQCIANPIKLARLVMEXTPHCFLTDQGAAQFAAAMGVPEIPQEK
LVTEP^KKRI^KEKHEKGAQKTDCQKNLGWGAVALDCKGNVAY
AT3TGGIVNKMVGRVGDSPCLGAGGYADNDIGAVSTTGHGESIh
KVNLARLTLFHIEQGKTVEEAADLSLGYMKSRVKGIiGGLIWSK
rGDWVAKWTSTSMPWAAAKDGKLHFGIDPDDTTITDLP 


6516 


1 


1402 


FRRLRYLGQDATAAARDLRTRGLQGYCPSATARQQVLVSAL-QQL 
KGRRSEHRNENQBMPYSTNKELI)LGIMVGTAGISLLLIiWYHKVR 
KPGIAMKLPEFLSLGNTFNSITXiQDEInDDQGTTVIFQERQLQI
LiEKLNELLTNMEELKEEIRFLICEAIPKLEEYIQDEIjGGKITVHK 
rSPQHRARKRRLPTIQSSATSNSSEEAESEGGYITANTDTEEQS 








FPVPKAFNTRVBSLNLDVLLQKVDHLRMSESGKSESFEtLRDHK
EXFRDEIEFMWRFARAYGDMYELSTNTQBKKHYANIGKTLSERA
INRAPMWGHCHLWYAVLCGYVSEFEGliQHKINyGHIiFKEHLDlA 
IKLLPEEPFLYYLKGRYCYTVSKLSWIEKKMAATLFGKIPSSTV 
QEALHNFLKAEELCPGYSNPNYMYLAKCYTDLEBNQKAIiKFCNL 


6517 
6518




1414 


GRVWGGSSSLNAMVYVRGHAEDYERWQRQGARGWDYAHCLPYFR
KAQGHELGASRYRGADGPLRVSRGKTNHPLHCAFLEATQQAGYP
LTEDMNGFQQEGFGWMDMTIHEGKRWSAACAYLHPALSRTNLKA
EAErLVSRVLFEGTRAVGVEYVKNGQSHRAYASKEVlLSGGAIN
SPQLLMLSGIGNADDLKKLGIPWCHLPGVGQNLQDHI,EIYIQQ
ACTRPITLHSAQKPLRKVCIGLEWLWKFTGEGATAHLETGGFIR
SQPGVPHPDIQFHFLPSQVIDHGRVPTQQEAYQVHVGPMRGTSV
GWLKLRSANPQDHPVIQPNYLSTBTDIEDFRLCVKLTREIFAQE
ALAPFRGKELQPGSHIQSDKEIDAFVRAKADSAYHPSCTCKMGQ
PSDPTAWDPQTRVLGVENZiRWDASIMPSMVSGNLNAPTIMlA
EKAADIIKGQPALWDKDVPVYKPRTLATQR




242 


1098 


PAWNPGSEPRTRVRPRARSFPLPPPRAPRRRRHRLLRAVPGPSR 
RHRCRRRAPPPPSTMGDAGSBRSXAPSLPPRCPCGPWGSSKTKN 
LCSKCFADFQKKQPDDDSAPSTSNSQSDLFSEETTSDNNNTSIT 
TPTLSPSQQPIiPTELNVTSPSKEECGPCTDTAHVSLITPTXRSC
GTDSQSENEASPVKRPRLLENTERSEETSRSKQKSRRRCFQCQT

KtiELVQQELGSCRCGYVFCMLHRLPEQHDCTFDHMGRGREKAIM 
KMVKLDRKVGRSCQRIGEGCS


6519 


3 


1113 


ERKMAEPPSPVHCVAAAAPTATVSEKEPFGKIjQLSSRDPPGSIiS
AXKVRTEEKitAPRRVNGEGGSGGNSRQLQPPAAPSPQSYGSPAS
WSFAPLSAAPSPSSSRSSFSFSAGTAVPSSASASLSQPGPRKLL
VPPTLLHAQPHHUiLPAAAAAASANAKSRRPKEKREKERRRHGL
GGAREAGGASREENGEVKPLPRDKIKDKIKERDKEKEREKKKHK
VMNEIJCKENGEVKILLKSGKEKPKTNIEDLQIKKVKKKKKKKHK
BNEKRKRPKMYSKSIQTICSGLLTDVEDQAAKGILNDNIKDYVG
KNLDTKNYDSKlPENSBFPFVSLKEPRVQNNLKRLDTIiBFKQLI
HIEHQPWGGASVTKCLQ


6520 


3 


1113 


ERKMAEPPSPVHCVAAAAPTATVSEKEPFQKLQliSSRDPPGSLS 
AKKVRTEEKKAPRRVNGEGGSGGNSRQLQPPAAPSPQSYGSPAS 
WSFAPLSAAPSPSSSRSSFSFSAGTAVPSSASASLSQPGPRKLIi 
VPPTLLHAQPHHLLLPAAAAAASANAXSRRPKBKREKERRRHGL 
GGAREAGGASREENGEVKPliPRDKIKDKIKERDKEKEREXKKHK 
VMNEIKKENGEVKILLKSGKEKPKTNIEDLQIKKVTCKKKiOGCHK 
ENEKRKRPKMYSKSIQTICSGLI.TDVEDQAAKGILNDNIKDYVG 
KNLDTKNYDSKIPEN3EFPFVSLKEPRVQNNLKRLDTLEFKQLI
HIEHQPNGGASVIHCLQ 


6521
6522 


184 
1042 


1798

391


KLFKKATDTSQGELVHPKALPLIVGAQLIHADKLGEKV2DSTMP 
IRRTVNSTRETPPKSKLAEGEEEKPEPDISSEESVSTVEEQENE 
TPPATSSEABQPKGEPENEEKEENKSSEETXKDEKDQSKEKEKK 
VTCXTIPSWATLSASOLARAQKQTPMASSPRPKMDAILTEAIXAC 
FQKSGASWAIRKYIIHKYPSLELERRGYLLKQALKRELNRGVI
KQVKGKGASGSFVWQKSRKTPQKSRNRKNRSSAVDPEPQVKLE 
DVLPLAFTRLCEPKEASYSLIRXYVSQYYPKLRVDIRPQLJjKNA 
LQRAVERGQLEQITGKGASGTFQLKKSGEKPLLGGSLMEYAILS 

AIAAMNBPKTCSTTALKKYVLENHPGTNSNYQMHLLKKTLQKCB
KNGKMEQISGKGFSGTFQLCFPYYPSPGVLFPKKEPDDSRDEDE
DEDESSEEDSEDEEPPPKRRLQKKTPAKSPGKAASVKQRGSKPA

PKVSAAQRGKARPLPKKAPPKAKTPAKKTRPSSTVIKKPSGGSS
KKPATSARKE

^KWLRPSPRSHRTPBSGRVLSLFRLPPPGMALSGSTPAPCWEED








ECLDYYGMLSLHRMPEWGGQLTECELBIiLAFLLDEAPGAAGGL 
SRARSGLKLLbEIiERRGQCDESl^RIiLGQLLRVLARHDLLPHIiA 
RKRRRPVSPERYSYGTSSSSKRTEGSCRRRRQSSSSANSQQGSP 
PTKRQRRSRGRPSGGARRRRRGPQPHPSSSQSPPDLPLICAK


6523 


2 


1097 


ASCQTRRRTAALDSGERiAGRRSPIALAMASNFNDIVKQGYVKI
RSRKLGIFRRCWLVFKKASSKGPRRLBKFPDEKAAYFRNFHKVT
ELHNIKNITRLPRETKKHAVAIIFHDETSKTFACESELEAEEWC
KHLCtfECLGTRLNDISI/5EPDLLAAGVQREQNERFNVYLMPTPN
IiDIYGECTMQITHENIYLWDIHNAKVKLVMWPLSSLRRYGRDST
WFTFESGRMCDTGEGLFTFQTREGEMIYQKVHSATLAIAEQHER
LMLEWEQKARLQTSLTEPMTLSKSISLPRSAYWHHITRQNSVGE
IYSLQGNHENRHSDLTGKSCKTSENRFLEENAPLVMYGIT/HHLF
MDTSTCKWKDLE 


6524 


2 


1097 


ASCQTRRRTAAIiDSGERIAGRRSPIAIiAMASNFNDIVKQGYVKI 
RSRKLGIPRRCWLVFKKASSKGPRRIiEKFPDEKAAYFRNFHKVT
ELHNIKNITRLPRBTKKHAVAIIFHDETSKTFACESELEAEEWC
KHLCMECLGTRLNDISLGEPDLLAAGVQREQNERFNVYLMPTPN 
LDIYGECTMQITHBNIYLWDlHNAKVXLVMWPLSSIiRRYGRDST 
WFTFESGRMCiyKMGLFTP^REGEMIYQKVHSATLAIAEQHER 
LMLEMBQKARLQTSLTEPMTLSKS1SLPRSAYWHHITRQNSVGB 
IYSI£GNHBNRHSDLTGKSCKTSENRFLEENAPLVMYGITTfflLF 
MDTSTCKWHDLB 


6525 


1 


1859 


GESPFSEBESIEFNPSSSGRSARTVSSNSFCSDDTGWPSSQSVS
PVKTPSDAGNSPIGFCPGSDEGFTRKKCTIGMVGEGSIQSSRYK
KESKSGLVKPGSEADFS5S3STGSISAPEVHMSTAGSKRSSSSR
NRGPHGRSNGASSHKPGSSPSSPREKDLLSMLCRNQLSPVNIHP
SYAPSSPSSSNSGSYKGSDCSPIMRRSGRYMSCGENHGVRPPNP 
EQYLTPI^XiKEVTVRHLKTKLKESERRLHBRESEIVELKSQLAR 
MREDWIjtsajsUHKVEAQLAIiKEARKEIKQLKQVIETMRSSLADKD
KGIQKYFVDINIQNKKLESLLQSMEMAHSGSLRDELCLDFPCDS
PEKSLTI^PPLDTMADGIiSLBEQVTGEGArRELLVGDSIANSTD
LFDEIVTATTTESGDLELVHSTPGANVLELLPIVMGQEEGSVW
ERAVQTDWPYSPAISELIQSVLQKLQDPCPSSLASPDESEPDS
MESFPESLSALWDLTPRNPNSAILLSPVETPYANVDAEVHANR
I^RELDFAACVTIERLTCVIPLARGGVVRQYWSSSFLVDLLAVAA 
PWPTVLWAFSTQRGGTDPVYNIGALI^GCO/VALHSLRRTAFR 
IKT 


6526 


2 


2034 


SGRAGEPEEWRGRQIIDSKETWIPFNSEDSQQIiEEAYSSGKGCN 
GRVVTTDGGRYDVHLGERMRYAVYWDELASEVRRCTWFYKGDKD 
NKYVPYSESFSQVLEBTYMLAVTLDEWKKKLESPNRE11ILHNP
KLMVHYQPVAGSDDWGSTPMEQGRPRTVKRGVENISVDIHCGEP
LQIDHLVFWHGIGPACDLRFRSIVQCVNDFRSVSLNLLQTHFK
KAQENQOIGRVEFLPVNWHSPLHSTGVDVDLQRITLPSINRLRH
FTNDTILDVFFYNSPTYCO^IVDTVASEMNRIYTLFLQRNPDFK 
GGVSIAGHSLGSLILFDILTNQKDSLGDIDSEKGSLNIVMDQGD 
TPTLEBDLKKJjQLSEFFDI FEKEKVDKEALALCTDRDLQEIGIP 
LGPRKKILNYFSTRKNSMGIKRPAPQPASGANIPKESEFCSSSN
TRNGDYLDVGIGQVSVKYPRLIYKPEIFEAFGSPIGMFLTVRGL
KRIDPNYRFPTCKGFFNIYHPFDPVAYRIEPMVVPGVEFEPMLI 
PHHKGRIOlMHLBI#REGLTPJ1SMDLKNNLLGSIiRMAWKSFTRAPY 
PALQASBTPEETEAEPESTSEKPSDVNTEETSVAVKEEVLPINV 
GMLNGGQRIDYVLQEKPIESFNEYLFALQSHLCYWESEDTVLLV
LKEIYQTQG1FLDQPLQ 


6527 


1 


922 


GWVPLLSRILPSDACKIYKQGINIRLDTTLIDFTDMKCQRGDLS
FIFNGDAAPSESFWLDNBQKVYQRIHHEESEMETEEEVDILMS
SDIYSATLSTKSISFTRAQTGVfLFREDKTERVGNFIADFYIiVNG 








LVLESRKRREHLSEEDILRNKAIMESLSKGGNIMEQNFEPIRRQ

SLTPPPQNTITWSEYISAENGKAPHLGRELVCKESKKTPKATIA

MSQEFPLGIELLLNVLBVVAPFKHFNKIjREFVQMKCjPPGFPVICL 
DIPVFPTITATVTFQEFRYD3FDGSIFTIPDDYKEDPSRFPDL 


6528 


1 


1073 


IjTGPAAAEPRCAADAGMKRALGRRKGVWLRLRKILFCVLGt»YIA
IPFLIKLCPGIQAKLIPLNFVRVPYFIDLKKPQDQGLNHTCNYY

LQPEED\RRIGVWHTVPAVWWXNAQGKDQMWYEDALASSHPIILY
LHGNAGTRGGDHRVELYKVLSSLGYHWTFDYRGWGDSVGTPSE
RGMTYDALHVFDWIKARSGDNPVYIWGHSLGTGVATNLVRRLCE

RETPPDALILESPFTNIREEAKSHPFSVTYRYFPGFDWFFLDPI 
TSSGIKFANDENVKHISCPLLILHAEDDPVVPFQLGRKLYSIAA 
PARSFRDFKVQFVPFHSDLGYRHKYIYKSPELPRILREFLGKSE
PBHQH 


6529


363 


2215 


THIRYNKIGWKTMSCGNEFVETLKKtGYPKADNLNGEDFDWLF
BGVBDESFLXWFCGNVNBQNVLSEREIjEAFSILQKSGKPIIjEGA
ALDEALKTCKTSDLKTPRLDDKELEKLEDEVQTLLKLKNLKIQR
RNKCQLMASVTSHKSLRLNAKEEEATKKLKQSQGILNAMITKI5
NELQAIjTDEVTQLMMFFRHSNLGQGTNPLVFLSQFSLEKYLSQE
EQSTAALTLYTKKQFFOGIHEWESSNESQFFNFLKIQTPSICD
NQEILEERRLEMARLQLAYICAQHQLIHLKASNSSMKSSIKWAE 
ESUISLTSKAVDKENLDMISSLTSEIMKLEKEVTQIKDRSLPA 
WRENAQLL^PVVKGDFDLQITiKQDYYTARQELVLNQLIKQKA 
SFELLQLSYBIELRKHRDIYRQLENLVQELSQSNMMLYKQLEML 
TDPSVSQQlNPRNTIDTKDYSrHRLYQVLEGENKKKELFLTHGN 
LEEVAE3CLKQNISLVQDQLAVSAQEHSFFLSKRNKDVDMLCDTL
YQGGNQLLLSDQELTEQFHKVESQLNKLNHLLTDILADVKTICRK
TLANNKLHQMEREFYVYFLKDEDYLKDIVENLETQSKIKAVSLE
D 


6530


128 


298* 

- 


GAAHHGAIVQVHPLLPGSSTIMIHDLCLVFPAPAKAWYVSDIQ
ELYIRVVDKVEIGKTVKAYVRVLDLHKKPPLAKYFPFMDLKLRA
ASPIITLVALDEAIiDNYTITFLXRGVAIGQTSLTASVTNKAGQR
INSAPQQIEVFPPFRLMPRKVTLLIGATMQVTSEGGPQPQSNIL
FSISNESVALVSAAGLVQGLAIGNGTVSGLVQAVDAETGKW11
SQDLVQVEVLLLRAVRIRAPIMRMRTGTQMPIYVTGITNHQNPF
SFGNAVPGLTFHWSVTKRDVLDLRGRHHEASIRLPSQYNFAMNV
LGRVKGRTGLRAWKAVDPTSGQLYGLARELSDEIQVQVFEKLQ
LLNPBIEAEQILMSPNSYIKLQTNRDGAASLSYRVLDGPEKVPV
VHVDEKGFLASGSMIGTSTIEVIAQEPFGANQTIIVAVKVSPVS
YLRVSMSPVLHTQIfiCEALVAVPLG^VTFTVHFHDNSGDVFHAH
SSVLNFATNRDDFVQIGKGPTNNTCVVRTVSVGLTLLRVWDAKH
PGLSDFMPLPVLQAISPELSGAMVVGDVLCLATVLTSLEGLSGT
WSSSANSILHIDPKTGVAVARAVGSVTVYYEVAGHLRTYKEVW
SVPQRIMARHLHPIQTSFQEATASKVIVAVGDRSSNLRGECTPT
Uru&v^uiuinraiuisuuaurKrAvc UrroUUVr i VLPQFDTALG 
QYFCSITMHRLTDKQRKHLSMKKTALWSASLSSSHFSTEQVGA
EVPFSPGLFADQAEILLSNHYTSSEIRVFGAPEVLENLEVKSG3
PAVLAFAKEKSrcwPSFITYTVGVLDPAAGSQGPLSTTLTFSSP
VTNQAIAIPVTVAFWDRRGPGPYGASLFQHFLDSYQVMFFTLF
ALLAGTAVMIIAYHTVCTPRDLAVPAALTPRASPGHSPHYFAAS
SPTSPNALPPARKASPP5GLWSPAYASH 


6531


84* 


1425 


PSASIPPSASPDPVPDIRTCHFCLVEDPSVGCISGSBKCTISSS 
SLCMVITIYYDVKVRF-IVRGCGQYISYRCQEKRNTYFAEYWYQA
QCCQYDYCNSWSSPQLQSSLPEPHDRPLALPLSDSQIQWFYQAL
NLSLPLPNFHAQTEPDGLDPMVTLSLNLGLSFAELRRMYLFLNS
SGLLVLPQAGLLTPHPS 


6532 


2 


954


AAGPPSEWNQDSLFPEPEPGPAPQVLLGPQGPGLIlfGVAfe>PTL 








I TDSTGTHLVLTVTNKNAH3 PGLS RGS PQQPS SQPGSPAPAPSA 
QMDLEHPIiQPLFGTPTSLLKKEPPGYEEAMSQQPKQQENGSSSQ 
QMDDLFDILIQSGEISADFIG2PPSLPGKEKPSPKTVCWSPIAAQ 
PSPSAELPQAAPPPPGSPSLPGRLEDFLESSTGLPLLTSGHDGP 
E PLS L I DDLHSQMLSS TAX LDHP P S PMDTSELHFVPEPSS TMGL 
DIJUXJHLDSMDWLELSSGGPVLSLAPLSTTAPSLPSTDFLDQHD 
LQLHWDSCL 


6533 


1798 


373 


STISWLARVEPPRRSSGVGAARLRFPGGSRPLRARACVLALAVL 
ALLBRWNADSMSAHSMLCERXAIAKEIjIKRAESLSRSRKGGIEG 
GAKLCSKLKAELKFLQKVBAGKVAI KESHLQSTNLTHLRAIVES 
AENLEEWSVLHVFGYTDTLGEKQTLVVOWANGGHTWVKAIGR 
KABALHNIWIX3RGQYGDKSI IEQAEDFLQASNQQPVQ YSNPHI I 
FAFYNS VSSPMAEKLKEMG I S VRGDIVAVNALLDHPEELQ PSES 
ESDDEG PELLQVTRVDREN I LAS VAF PTE I KVDVCKRVNLDI TT 
LITYVSALSYQGCHFIFKEKTLTEQAEQERKEQVLPQliEAFMKD 
KELFACESAVXDFQ5 ILDTLGGPGBRERATVLI XR INWPjDQPS 
ERALRLVASSKINSRSLTI FGTGDTLKAITMTANSGFVRAANNQ 
GVXFSVFIHQ P R ALT ESKEALATPL PKDYTTDSEH 


6534 


47 




KATRFXS AAFWIiNKQGVS PAKLPHTS WS WSLQTLS FLFS G DLA 
EKSLQCFPCSAMIiLELIPLLGIHFVLRTARAQSVTQPDIHITVS 
EGASLELRCNYSYGATPYTjFWMERTVEEAPILLVCLKPWRVASS 
LEKKEKEDESFQIOiWSRYNVLKAHCLIaPLIRWLTSGDSLliSAQ 
PHCPQGL 


6535 


250 


964 


D I KT F FRDV AI QRD b LPKE KN LETLLTLA FL E IDKAFS SHARLS " 
ADATLLTSGTTATVALLRDG IEL WAS VGDSRAILCRKGKPMKL 
TIDHTPERKDEKERIKKOSGPVAWMQT^nDHUwri'DT.awrTO o rnr\ 
LDLKTSGVI ABPETJCRIKLHHADDS FL VLTTDG INFM VNSQE I W 
DFVNQCHDPNEAAHAVTEQAI QYGTEDN3TAVVVP FGAWGKYKN 
SEINFSFSRSFASSGRWA 


6536 


242 


1174 


SLVKEMTNQYGILFKQEQAHDDAI WS VAWGTNKKENS ETWTGS " 
LDDLVKVWKWRDERLDLQWSLEGHCJLGVVS VDI S HTL PI AAS S S 
LDAH IRL WDLENGKQ IKS I DAGPVDAWTLAFS PDSQ YLATGTHV 
GKVNIFGVESGKKEYSLDTRGKFILSIAYSPDGKYLASGAIDGI 
INI FDIATGKLLHTLEGHAMP IRSLTFS PDSQLLVTASDDG YIK 
lYDVQHANLAGTLSGHASVn^NVAFCPDDTHFVSSSSDKSVKVW 
DVGTRTCVHTFFDHQDQVWGVKYNGNGSKIVSVGDDQEIHIYDC 
PI 


6537 


1638 


921 


WRFNPPPTQGPDPSLVYRPDVDPEVAKDKASFRNYTSGPLIiDRV 

FTTYKLMHTHQTVDFVRSKHAQFGGFSYKKMTVMEAV^ 

DESDPDVDFPNSFHAFQTAEGIRKAHPDKDWFHLVGLLHDLGKV 

LALFGEPQWAWGDTFPVQCRPQASWFCDSTFQDNPDLQDPRY 

ST1MMYQPHCGLDRVLMSWGHDGEARGGQWGGGGRWGTVGGGG 

AE AVPAGDTLS PQSTCTR 


6538


3345 


2412 


P YLYDFLDAL ITCQTAPEEAF I KLDGLAGMLTEQLRRLTKQVQE 
ARHNRDDEAIWCAVNEYDETMEKYIPVLMAOAKIYWNLENYPMV 
EKIFRKSVEFCNDHDVWKLNVAHVLFMQBNKYKEAIGFYEPIVK 
KHYDNILNVSAIVLAWLCVSYIMTSQNEKABELMRKIEKEEEQL 
SYDDPNRKMYHLCIVNLVIGTLYCAKGNYEFGISRVIKSLEPYN 
KKLGTDTWY YAKRCFLSLLENMS XHMI VIHDS VI QECVQ FLGHC 
BLYGTN I PAVI EQ PLE EERMHVGICNTVTDESRQLKAL I YE I IGW 
NK 


6539


218 


339 


FLGAASPHPHFSSIiAPHPDOPEFTPVQDELEAMELWGPGV 


6540


3 


391


LERLWI^I^J^PEDAMAECPTLGEAVTDHPDRLWAWEKFVYLDE 
KQHAWLPLTIEIKDRLQIiRVLLRREDWLGRPMTPTQIGPSLLP 
I MWQL YPDGR YRSSDS S FWRLVYH I KIDGVKDM LLE LLP DD 


6541 


1165 


536 


RTIiVQRRILMEjLRKPARGRDLRGRGRGTPRGGRKGLLPTPDEFP 
R FEGGRKPDS WD GNREPGPGHEH FRDTPR PDHPPHDGHS PAS RE 
RSSS LQGMDMASLP PRKRPWHDGPGTSHHREMEAPGGPSEDRGG 
KGRGGPGPAQRVPKSGRSSSLDGEHHDGYHRDEPFGGPPGSGTP 
SRGGRSGSNWGRGSHMNSGPPRRGASRGGGRGR 




6542


3


377fe 


SWPRGRGETGGHPGALRTRTMQKSVRYNEGHAIiYIiAFLARKEGT 
KRGFLS KKTAfiAS R WHEKW FAL YQNVLFY FEGEQS CR PAGM YLL 
EG CS CERT P AP PRAGAGQGGVRDALDKQYYFTVLFGHEGQKPLE 
LRCEEEQDGKEWMEAIHOASYADILIERBVLMQKYIHLVQIVBT 
EKIAANQLRHQLEDQDTEIERLKSEIIALNKTKERMRPYQSNQE 
DEDPDIKKIKKVQSFMRGWLCRRKWKTIVQDYICSPHAESMRKR 
NQIVPTMVEAESEYVHQLYILVNGFLRPIiRMAASSKKPPISHDD 
VS S X FLNSET IMFLHE I FHQGLKARI ANWPTLI LADLFD I LLPM 
LNI YQE FVRNHQ YS LQ VLANCKQNRDFDKLL KQ YEAN P ACEGKM 
LETFLTYPMFQI PRYI ITLHEI»LAHTPHEHVERKSLEFAKSKLE 
ELSRVMHDEVSDTBNIRKNLAIERMI VEGCD I LLDTSQTFIRQG 
SLIQVPSVERGKLSKVRLGSLSLKKEGERQCFLFTKHFLI CTRS 
SGGKljniKTGGVLSLIDCTLIEEPDASDDDSKGSGQVFGHXiDF 
KIVVEPPDRAAFTVVLLAPSRQEKAAWMSDI SQCVDNIRCNGliM 
TIVFEENS KVTVPHMIKSDARLHKDDTDICFSKTLNSCKVPQIR 
YASVERLIiERliTDLRFLS IDPLNTFLHTYRIFTTAAVVLGKLSD 
I Y KR PPTS I P VRSLEL FFATSQNNRGEHLVDG KS PRJjCRKFS SP 
PPLAVSRTSS PVRARKI1SI1T 5 ; PT»N«HCTRAT,nT.TTQ Q Q DTTTTnc 

PAASPPPHTGQIPLDLSRGLSSPEQSPGTVEENVDNPRVDLCNK 
LKRS IQKAVLES APADRAGVESSPAADTTEI#SPCRSPSTPRHItR 
YRQPGGQTADNAHCSVSPASAFAIATAAAGHGS PPGFNNTERTC 
DKEFI I PJ?TAT^VLtT\TLRH WSKHAQDFELNNELKMNVLNIjLE 
EVLRDPDLLPQERKAAANILMALSQDDQDDIHLKLEDIIQMTDC 
MKAECF^LSAMEIiAEQITIjIjDHVIFRSIPYEEFIiGQGWMKIjDK 
^RTPYIMKTSOHFtTOMSNLVASOTKMYAnvsSRT^MaTKTCWVAV 
ADICRCLHNYNGVLE I TS ALNRSAl" YRLKKTWAKVS KQTKAXiMD 
KLQKTVSSEGRFKNLRE^LKNCNPPAVPYLGMYLTDLAFIEEGT 
PNFTEEGLVNFS KMRMISH I IREIRQFQQTS YRI DHQPKVAQYL 
LDKDLI IDEDTLYELSLKIEPRIiPA 


6543


1857 


950 


FVSGCGRAGIGLSWAMAAEARVSRWYFGGLASCGAAOCTHPIjDL 
LKVHLQTQQEVKLRMTGMALRVVRTDG I LAL YSGLS AS LCRQMT 
YSLTR FAI YETVRDRVAKGS QGPLPFHEKVLLGS VS GLAGG F VG 
TPADLVNVRMQNDVKLPQGORRNYAHALDGLYRVAREEGLRRLF 
SGATMASSRGALVTVGQLS CYDQAKQL VLSTG YLSDNI FTHFVA 
SFIAGGCATFLCXJPLDVLKTRLMNSKGEYQGVFHCAVETAKLGP 
LAFYKGLVPAGIRLI PHTVLTFVFLEQLRKNFGI KVPS 


6544 


630 


79 


PSPCFIRSRLDGQPWMAGLEAWLSQNFSLHQPQSRVRVRRASIS 
EPSDTDPEPRTLNPSPAGWFVQQHPBLBLMSSFRERFGRNWLQY 
RSHLEPSGKPIiPATPTTSAPSAPPASSQGPDTAPRPSPPQEEAR 
GPQES PQKMSEEVRAEPQEEEEEfQ*X3KEEKEEGEMAPLFEAHLG 
EGKQKECP 


6545 


176 


560 


P PHSHAALLPAAMTPLLT L I LiWtiMGL PLAQALDCHVCAYNGDN 
CF^MRCPAMVAYCMTTRTYYTPTRMICVSKSCVPRCFETVYDGY 
SKHASTTS CCQ YDLCNGTGLATPATIiALAPILLATLWGLL 


6546


1657 


364 


HLLNGLD E VAA FF VADLGAIVRKH FCFLKCLP RVRP FYAVKCNS 
SPGVLKVLAQI/5LGFSCANKAEMEIjVQHIGIPASKI ICANPCXQ 
IAQ I KYAAKHG IQLLS FDNEM EIAKWKSHPS AKMVLCIATDDS 
HSLSCLSLKFGVSLKSCRHIJiENAKKHHVEVVGVSFHIGSGCPD 
PQAYAQSIADARLVFEMGTELGHKMHVLDtiGGGFPGTEGAKVRF 
EE I AS V INS AliDLYF PEGCGVD I FAELGR YYVTS AFTVAVS 1 1 A 
KKE VLLDQ PGREEENGSTS KT 1 VYHLDEG VYG I FNS VL FDUI CP 








tpilqkkpsteqplyssslWgpavdgcdcvaeglwlpqlh\/gdw 
l vfdnmgayt vgmgs pfwgtq ach i t yams rvawealrrqlmaa 
eqeddvegvckpls cgweitdtlcvgpvftpasim 


6547 


1 


541 


LHSKYLAPALCSQPGMMRCCRRRCCCEQPPHALRPLLLLPLVLL 
PP1AAAAAGPNRCDTIYQGPAECLIRLGDSMGRGGELETICRSW 
NDFHACASQVLSGCPEEAAAWJES LQQEARQAPRPNNLHTLCGA 

P VHVRERGTGS ETNQETLRATAPALPMAPAPPLLAAAIiALiAYLL 
RPLA 


6548 


2 


219 


FVSRLSVRDVRFPTFLGGHGADAMHTDPDYSAAYVPIETDAEDG 
IKGCGITFTLGKGTEVGELKILSRFQNA 


6549


73 


1490 


ETGRVCEDARPACGSRSRRRRKEAAPGIPTPSPSSSSPTSSRPA 
ARAFSKAPARLSRPRAREEPPDPGRRYIQBEIIQARKHKLIKMC 
S S VAAKL WFLTDRRIRBD YPQ KE I LRAL KAKCCE EEL DF RAWM 
DEVVLTIEQGNLGIiRINGELITAYPQVWVRVPTPWVQSDSDIT 
VLRHIjEKMGCRLMNRPQAILNCVNKFWTFQELAGHGVPLPDTFS 
YGGH2NFAKMIBEAEVLEFPMWKNTRGHRGKAVFIiARDKHHLA 
DLSHLIRHEAPYLFQKYVKESHGRDVRVIWGGRWGTMLRCST 
DGRMQSN CS LGG VGMMCSLSEQGKQLAI QVSNI LGMDVCG I DLL 
MKDDGSFCVCEANANVGFIAFDKACNLDVAGI IADYAASLLPSG 
RLTRRMSLLSWSTASBTSEPELGPPASTAVDNMSASSSSVDSD 
PBSTERELLTKLPGGLFNMNQLLANE IKLLVD 


6550 


2293 


922 


FRVSRDGAPDOGIEQMGIAMEHGGSYARAGGSSRGCWYYLRYffF 
LFVSLI QFL 1 1 LGLVLFMVYGNVH VSTESNLQATERRAEGL YSQ 
LLGLTASQSNLTKELNFTTRAKDAIMQMWLNARRDLDR I NASFR 
QCQGDRVI YTNNQRYMAAI I LSEKQCRDQFKDMNKS CDALL FML 
NQKVKTLEVBIAKEKT Z CTKDKES VLLNKRVAEEQLVECVKTRE 
I^HQERQLAKEQLQKVQALCLPLDKDKFEMDLRNLWRDS 1 1 PRS 
LDNLGYNLYHPLGSELASIRRACDHMPSLMSSKYEELARSLRAD 
IERVARENS DLQRQKLEAQQGLRAS QEAKQKVE K EAQAREAKLQ 
AECSRQTQLALEEKAVLRKERDNLAKELEEKXREAEQLRMELAI 
RNSALDTCIKTKSQPMMPVSRPMGPVPNPQPIDPASL»EEFKRKI 
LESQRPPAG1PVAPSSG 


6551 


157 


748 


IQPPDPRNMTLAAYKE KMKELPLVS LFCSCFI1ADPI1NKS SYKYE 
ADTVDLNWCVI SDMBVIELNKCTSGQS FE VI LKP PS FDG VPE FN 
ASLPRRRDPSLEEIQKKIiEAABBRRiCYOEAEZiLKHZAEKREHER 
BVIQKAIEJB^FIKMAKEKLAQKhESNKENREAH 
BKDKHAEEVRKNKELKEEASR 


6552 


157 


748 


IQPPbPRNMTIJUlYKEKMKELPLVSLFCSCFIJuOPUJKSSYKYE 
ADTVDLNWCVI SDMEVIBLNKCTSGQ S FEVILKP PS FDGVPE FN 
ASLPRRRDPSLEEIQKKLEAAEERRKYQBAELLKHLAEKREHER 
EVIQKAIEENNNFI KMAKEKLAQKMES NKENREAHLAAMLE RLQ 
EKDKHAEEVRKNKELKEEASR 


6553 



2 


1807 


FVWS KMAAHLS YGR VNLNVLR EAVRRELREFLDkCAGS KA I VWD 
BYLTGPFGLIAQYSLLKEHEVEKMFTLKGNRLPAADVKNII FFV 
RPRLBLMD I IAENVLSEDRRGP TRDFH I L FVPRRS LLCEQRL KD 
LGVLGSFIHREBYSLDLIPFDGDLLSMESEGAFKECYliEGDQTS 
LYHAAKGLMTLQALYGTIPQI FGKGECARQVANMMIRMKREFTG 
SQNS I FPVFDNLLLLDRNVDLLTPLATQLTYEGLIDE I YGIQNS 
YVKLPPEKFAPKKQGDGGKDLPTEAKKLQLNSAEBLYAEIRDKN 
FNAVGSVLSKKAKI ISAAFEERHNAKTVGEIKQFVSQLPHMQAA 
RGSLANHTSIAELIKDVTTSEDFFDKLTVEQEFMSGIDTDKVNIJ 
Y I EDC I AQ KHSL IKVLRLVCLQSVCNSGLKQ KVLD Y YKRE I LQT 
YG YEH I LTIjHNLEKAGIiKPQTGGRNNYPTIRKTIjRL WMDDVNE 
QNPTDI S YVYSGYAPLSVRLAQLLSRPGWRS IEEVLRILPGPHF 
EERQPLPTGLQKKRQPGENRVTLIFFLGGVTFAEIAALRFLSQL 
EDGGTEYVIATTKLMNGTSWIEALMEKPF 


6554 


119 


1244 


FEMGSQVSVBSGALHWIVGGGFGGIAAASQLQALNVPFMLVDM 
KDSFHHNVAALRASVETGFAKKTFISYSVTFKDNFRQGLVVGID 
LKNQMVLLQGGEALPFSHLILATGSTGPFPGKFNEVSSQQAAIQ 
AYEDM VRQVQi^RFI VVVGGGSAGVEMAAEI KTEYPBKEVTLIH 
SQVALADKELLPSVRQEVKEILLRKGVQLLLSERVSNLBELPLN 
EYREY IKVQTDKGTE VATNLVILCTGI KINS S AYRKAFES RLAS 
SGALRVNEHLQVEGHSNVYAIGDCADVRTPKMAYLAGLHANIAV 
ANIVNSVKQRPLQAYKPGALTFLLSMGRNDGVGQISGFYVGRLM 
VRLTKSRDL FVS TS WKTMRQS PP 


6555 


1552 


498 


IHMALLRKI*IQVXLFLI,IVTtt:VliY^ 

TPEELEEEIPWICAAAGRMGATMAAINSIYSNTDANILFYWG 
LRNTLTRIRKWI EHSKLREIfiTFKI VEFNPMGLKGKIRPDS SRPE 
LLQPLNFVRFYLPLLIHQHEKVIYLDDDVIVQGDIQELYDTTLA 
LGHAAAFSDDCDLPSAQDINRLVGIiQNTYMGYLDYRKKAI KDLG 
ISPSTCSFNPGVIVANMTEWKHQRITKQLEKWMQKNVEENLYSS 
SLGGG VATS PML I VFHGKYS TINPLWH I RHLGW N Pn»R y q pit jtt 

OEAIOiIiK^NGRHKPWDFPSVWDLWESWFVPDPAGIFKLNHHS 


6556 


241 


1449 


ASLCKGCFFVTHVLVIILPSLQSPPTFGFLLDIDGVLVRGHRVI 
PAAL KAFRRL VlSIS QGQIJRVP VVTVTENAGNILQHSKAQEIjSALLG 
CEVBADQVTLSHS PMKLFS E YHEKRMLVSGQGPVMENAQGLG FR 
NWTVDELRMAFPLLDMVDLERRLKTTPIiPRNDFPRIEGVLLLG 
EPVllWBTSLQLIMDVLLSNGSPGAGLATPPYPHLPVIiASNMDLL 
WMAEAKMPRFGKGTFLLCLETIYQKVTG KELR YEGLMG K PS I LT 
YOYAEDIiI RRQA E RRGWAAP I RKL YAVGDNPMS DVYGANLFHQ Y 
LQKATHDGAPELGAGGTRQQQPSASOSCISILVCTGWMPPWDn 
STE PVLGGGEP P FHGHRDLCFS PGLMEASHWNDVNEAVQLVFR 
KEG WALE 


6557 


2598 


1534 



RMCGRTSCHLPRDVLTRACAYQDRRGQQRLPEWRDPDKYCPSYN " 

KSPQSNSPVLLSRIjHFEKDADSSERIIAPMRWGLVPSWFKESDP 

SKLQFNTTNCRSDTVMEKRS FKVPLGKGRRCWLADGFYEWQRC 

QGTNQRQPYFIYFPQIirrEKSGSIGAADSPENWEKVWDNWRLLT 

MAGIFDCWEPPEGGDVLYS YTI ITVDSCKGLSDIHHRMPAILDG 

EEAVSKWLDFGBVSTQEALKLIHPTENITFHAVSSWNNSRNNT 

PECXAPVDLWKKELRASGSSQRMI^WLATKSPKKEDSKTPQKE 

BSDVPQWSSQFLQKSPLPTKRGTAGIiLEQWIjKRBKEEEPVAKRP 

YSQ 




6558


21


1138 


FHGRRRGGRKMELGS CLEGGREAAEEEGBPEVKKRRLLCVEFAS 
VAS CDAAVAQ CFLAEND WE ME RALNS YFE PPVEESALERRPBTI 
SEPKTYVDLTNEETTDSTTSKISPSEDTQQENPSMFSLITWWID 
GLDLNNLSERARGVCSYIJu^YSPDvTFLQEVIPPYYSYLKKRSS 
NYEI I TGHEEG YFTA IMLKKSRVKLKS QE 1 1 PFPS TKMMRNLLC 
VHVWSGNELCLMTSHLESTRGHAAERMNQLKMVIiKKMQEAPES 
AWIFAGDTNLRDREVTRraGLPNNIVTrVWEFLGKPKHCQYTWD 
TQMNSNLG I TAACKLR FDR I FFRAAAE EGH I I PRSLDLLGLEKL 
DCGRFPSDHWGLLCNLDIIL 


6559 


3 


364 


GPEIiSGLPTRPKKLKANQTPIAMDCGASRSCSVPTGPATTICSS" 
DKSCRCGVCLPSTCPHTVWLLEPTCCDNCPPPCHIPQPCVPTCF 
LLNSCQPTPGLETLNLTTFTQPCCEPCLPRGC 


6560 


3 


1435 


TATSGGIWLRRKWRCHWPRPLPQSCVGTEGGLQVRDTSSRIAKG 
GVDHTKMSliHGASGGHERSRDRRRSSDRSRDSSHERTESQLTPC 
IRNVTSPTRQHHVEREKDHSSSRPSSPRPQKASPNGS ISSAGNS 
SRNSSQSSSTCSCKTAGEMVFVYENAKEGARNIRTSERVTLIVD 
NTRFWDPS I FTAQPNTMLGRMFGSGREHNFTRPNEKGEYEVAE 
G 1 GS TVFRAI LD YY KTG I IRCPDG I S I PELREACDYLC I S FEYS 
TI KCRDLSAL^ELSNDGARRQFEFYI^EMILPLMVASAQSGER 
BCHrVVLTDDDVVDWDEEYPPQMGEEYSQIIYSTKLYRFFKYIE 








fliu/Tfuvo vuaokuxjaivxiuajXJSvX 1 1 1 K±»itvtvi\KPGGRPBvIYN 
YVQRPFIRMSWEKEEGKSRHVDFQCVKSKSITNLAAAAADIPQD 
QLWMHPTPQVDELDIIjPIHPPSGNSDLDPDAQNPMIj 


6561 




1086 


PGRRFRRKESSSSRWFPADCLLGLRGPASSLLdPEPSPSWPSHS 
PCPMAALTDLSPMYRWFKNCNLVGNLSEKYVFITGCDSGFGNLL 
A KQLVDRGMQVLAACFTEEGSQKLQRDTS YRLQTTLtiDVT KS B S 
IKAAAQVAOiDKVGEQGLWALVNNAGVGLPSGPNEVTLTKDDFVKV 
INVNLVGLI 2VTLHMIjPMVKRARGRWNMSSSGGRVAVIGGGYC 
VS KFGVEAFSDS I RRELYYFGVXVCI IEPGN YRTAILGKENLES 
RMRKLWERLPQETRDS YGEDYFRI YTDKLKNIMQVAR PRVRBVI 

NSMEHAIVSRSPRIRYNPGLDAKLLYIPLAKLPTPVTDFILSRY 
LPRPADSV 


6562 




T CO 


MSTLYDIRAHKAQLLRFFASSDSNKALEQRRTLHTPKLEHLDRV 
L YE WFLG KBS EGVPVSG PML I BXAKDFYEQMQLTB P CVFSGG WL 
WRFKARHGIKKLDASSEKQSADHQAAEQFCAFFRSLAAEHGLSA 
EQVYKADETGIiFWRCLPNPTPEGGAVPGPKOGKDRLTVLMCANA 
TGSHRLKPLAIGKCSGPRAFKGIQHLPVAYKAQGNAWVDKEIFS 
DW FHH I FVP S VREHFRT 1GLPEDS KAVLLLDS SRAH P QEAE LVS 
SNVFTI FL P AS VASLVQPMEQG IRRDFMRNF INP P VPLQGPHAR 
YNMNDAI FSVACAWNAVPSHVFRRAWRKLWPSVAFAEGSSSEEE 
LEAECFPVKPHNKSFAHILELVKEGSSCPGQLRQRQAASWGVAG 
REAEGGR P P AATS PAEWWSSEKTPKADQDGRGDPGEGEE VAWE 
QAAVAFDAVLRFAERQPCFSAQEVGQLRALRAVFRSQQQVRRRR 
GALGA WKVEALQEG PGG CGATAQS PLP CS S TAGDN 


6563 


1319 


2694 


LARPAQPVtLREPEGAGPPVPAGHLVHHLQGGHLRERAHPDilgA" 

HEHPLPCDQMFWRQMGGHLRMVEAWSRGWWG IGYDHTAWVYTG 

G YGGGCFQGLASS TSN I YTQSDVKCVHI YENQRWNP VTG YTSRG 

IiPTDRY^SD^GLQECTKAGTKPPSLQWAVTVSDWFYDFSVPGG 

TDQEGKQYASDFPASYHGSKIWKDFVRRRCWARKCKLVTSGPWL 

BVPPIALRBVSriPESPGAEGSGHSIALWAVSDKGDVLCRLGVS 

ELNPAGSSWIiHVGTDQP FA9CC S I GACYQVWAVARDGSAFYRGSV 

YPSQPAGDCWYHIPSPPRQRLKQVSAGQTSVYALDENGNLWYRQ 

GITPSYPQGSSWEHVSNNVCRVSVGPLDQVWVIANKVQGSHSLS 

RGTVCHRTGVQPHE PKGHG WD YGI GGG WDH I S VRANATRAPRSS 

SQEQEPSAPPEAHGPVCC 


6564 


1 


975


APGS CALWSYCGRGWSRAMRGCQX*I,GIiRSSWPGDLJjSARIjljSQE 
KRAAETHTOFETVSEEEKGGKVyQVFESVAKKYDVMNDMMSLGI 
HRVWKDLLLWKhlHPLPGTQLLiDVAGGTGD IAFRFIiNYVQS QHQR 
KQ KRQLRAQQNLS WEE LAKE YQNE EDSLGGS RVWCD INKEMLK 
VGKQ KALAQG YRAGLAWVLGDAEELP FDDDKFDI YTIA FG I RNV 
THIDQALQKAHRVLKPGQRFIjCIjEFSQVNNPLISRLYDLYSFQV 
I P VLG EVTAGDWKS YQYLVESIRRFP SQEEFKDMI EDAG FH KVT 
YESLTSGIVAIHSGFKL 


6565


1464 


999 


RSAVANGLTKRRMGLiOxNGRYlSLI LAVQIAYLVQAVRAAGKCD 
AVFKGFSDCLLKLGDSMANYPOGLDDKTNIKTVCTYWEDFHSCT 
VTALTDCQEGAKDMWDKLRKE S KNLNIQGSL FELCGSGNGAAGS 
LLPAFPVLLVSLSAAIATWLS F 


6566


3 


1385 


KYESAQPGGTQPEPGLGARMAIHKALVMCLGLPLFLFPGAWAQG 
HVPPGCSOGLNPLYYNLCDRSGAWGIVLEAVAGAGrVTTFVLTI 
I LVAS L P F VQDT KKRS LLGTQ V FFLIjGTLGLF CL VFACVE K P D F 
STCASRRFLFGVLFAI CFS CLAAHVFALNFLARKNHGPRGWVI F 
TVALLLTIiVBVI I WTEWL 1 1 TLVRGSGEGGPQGNSS AGWAVAS P 
CAIANMDFVMAL I YVMLJDIiLGAFLGAWPAIiCGR YIW WRJOTG VFV 
LLTTATS VAI WVVWI VM YTYGNKQHNS PTWDDPTLAI ALAANAW 
AFVLFYVI PE VSQVTKS SPEQS YQGDMYPTRGVGYETI LKEQKG 
QSMFVENKAFSMDEPVAAKRPVS P YS GYNGQLLTS VYQPTBMAL 








MHKVPSEGAYDIILPRATANSQVMGSANSTLRAEDMYSAQSHQA 
ATP PKDQKNSQVPRNPY VWD 


6567


125 


863 


TKRSNLKAYACS I HH IRTMS YVF VM DSSO/TNVPLLQAC I DGDFN 
YSOLLE^GraPNIRDSRGRTGLKIiAAARGNVDICQLLHKFGAD 
LIATD YQGNTALHLCGHVDTIQFL VSNGLKI D I CNHQGAT PLVL 
AXRRGVNKDVI RLLE SLE EQ EVKG FNRGTHSKLETMQTAESESA 
MBSHS LLN PNLQQGEG VLS S FRTTWQE FVEDLGFWRVLLL I FVI 
ALLSLGIAYYVSGVLPFVENQPBLVH 


6568


3 


1183 


HASDRLLVLPDNYSHFSQASANLQGPSRTTELFHPTLAS tSSPM 
I*EGABL YFNVDHGYliEGLVRGCKASLLTQQD YIKLVQCE TLEDL 
KIHLQTTOYGNFLANHTNPLTVS KIDTEMRKRIiCG E FB Y FRNHS 
LEPLSTFLTYMTCS YM1 DNVI LLMNGALQKKS VKEILGKCHPLG 
RFTEMEAVNI AETPSDLFNAI LI ETPLAPFFQDCMSEKALDEIiN 
IELLRNKLYK5 YLEAFYKFCKNHGDVTAEVMCP ILEFEADRRAF 
IITLNSFGTBLSKEDRBTLYPTFGKLYPEGLRLLAQAEDFDQMK 
NVADH YG VYKP LF EAVGG S G G KT LE D V FYERE VQMNVLA FNRQF 
HYGVFYAYVKLKEQEIRNIVWIAECISQRHRTKINSY1PIL 


6569 


205 


1532 


RRRGPQRLGHGRPTPIiLCRWRTAGPSHWEKQARAFQGLRPVDPR 
RMSWLFPLTKSASSSAAGS PGGLTSLQQQKQRLIESLRNSHSSI 
AE IQKDVEYRL PFTINNLT INI NI LL PPQFPQBKPVI S VYP P I R 
HHLMDKQGVYVTS PLVNNFTMHSDLGKI IQSLLDEFWKNPPVLA 
PTSTAFPYLYSNPSGMSPYASQGFPFLPPYPPQBANRSITSLSV 
ADTVSSSTTSHTTAKPAAPS JGVLSNLPLPI PTVDAS IPTSQNG 
FGYKMPDVPDAFPELSELSVSQLTDMNEQEEVLLEQFLTLPQLK 
Q 1 1 TD KDD L VKS I E ELAR KNLLLE PS LKAKRQT VLD KYE LLTQM 
KSTFEKKMQRQHELSBSCSASALQARLKVAAHEAEEESDNIAED 
FLEGKMEIDDFLSSFMEKRTICHCRRAKEEKLQQAIAMHSQFHA 
PL 


6570 


330 


1304 


ARLPRLTFLREGFLYVLLSHWVFVGAPRPPASDSWKKGLVPSAP " 
PASRKM3 S KALPAP I P LHPSLQLTNYS FLQAVNTFPATVDHLQG 
LYGLSAVO/TMHMNHWTUSYPNVHEITRSTItEMAAAO^ 
PFPALPFTTHLFHPKQGAIAHVLPALHFU3RPRFDFANLAVAATQ 
BDPP KMGDLSKLS PGLGSPI SGLS KLTPDRKPSRGRLPSKTKKE 
FI CKFCGRH FTKS YNLL IHE RTHTDBRP YTCDI CHKAFRRQDHL 
RDHRYIHSKEKPFKCQECGKGFCQSRTLAVHKTLHMQTSSPTAA 
SSAAKCSG ETVI CGGT 


6571 


169 


656 


APDMNRKKIjQKLTDTLTKNCKHLFRGFDKDNDGCVNVLEWI hgl 
SLFLRGSLEEKMKYCFEVFDLNGDGFISKEEMFHMLKNSLLKQP 
S EEDPDEG I KDLVE I TLKKMDHDHDGKLS FADYELAVREETLLL 
EAFGPCLPDPKSQM EFEAQVFKDPNE FNDM 


6572 


49 


1644 


T P ERAQ P G ALLGAAGCCVCG GR W W PR S HERG YFS SAKMGS KRRN 
LSCSERHQKLVDBNYCKKLHVQALKNVNSQIRNQMVQN3NDNRV 
QRKQFLRLLQNEQ FELDMBEAI Q KAEENKRLKELQLKQEEKLAM 
ELAKLKHESLKDEKMRO^VRBNSIELRELEKKLKAAYMNKERAA 
Q IAEKDAIKYEQMKRDAEIAKTMMEEHKRI I KBBNAAEOKRNKA 
KAQYYLDLEKQLEEQEKKKQBAYEQLLKEKLMIDBIVRKIYEED 
OLE KQQKLE KMNAMRR Y I EE FQKEQALWR KKKREENEEENRXII 
KFANMQ0X2REEDRMAKVQENEEKRLQLQNALTQKLBEMLRQRED 
LEQVRQEL Y QE EQAE I YKS KL KEE AEKKLRXQKEW KQDFEEQ MA 
LKBLVLQAAKEEEENFRKTMLAKFAEDDRIELMNAQKQRMKQLE 
HRRAVEKLIEBRRQQFLADKQRBLBEWQLQQRRQGFINAIIEEE 
RLKLLKEHATNLLGYLPKGVFKKEDDIDIiLGEEFRKVYQQRSEI 
CEEK 


6573


767 


275 


GGGGGESQS FRAQDGTRTPATDCLMYLC^PRXLkTQGGYDMVQK 
LFLDFFRRRLSQR P TAEE LEQRN I L KPRNEQE EQEEKRE I KRRL 
TRKLSQRPTVEBLRBRKrLIRFSDYVEVADAQDYDRRADKPWTR 


LTAADKVSRGECWRVGGRTVCWVSLGSPIjGSV


6574


204


1159


LES S VP Vs VG VFWACGVS WTGAAGIX3 DGALSDTMARNAE KAMTA
LARFRQAQIxEEGKVJCERRPFliASECTBLPKAEKWRRQIIGEISK
KVAQIQNAGLGEPlilRDLNDEINKLLREKGHWEVRIKELGGPDy
GKVGPKMLDHEGKEVPGNRGYKYFGAAKDLPGVRELFEKEPLPP
PRKTRAELMKA1DFEYYGYLDEDDGVIVPLEQEYEKKLRAELVE
KHKAEREAR LARGE KEEEEEE EEEIN I YAVTEEESDE EGS QEKG
GDDSQQKFIAHVPVPSQQEIEBALVRRKKMELLQKYASETLQAQ
SEEARRLLGY


6575


117 


820 


SPALASQSGGITEEiCMLEPQENGVIDLPDVEHVEDBTFPPPPPP 
ASPERQDGEGTEPDEESGNGAPVPVPPKRTVKRNIPKLDAQRLI 
S ERGLPALR HVFDKAXFKGKGHEAEDLKMLI RHMEHWAHRLFPK 
LQFEDFIDRVEYLGSKKEVQTCLKRIRIiDLPILHEDFVSNNDEV 
AENNEHDVTSTELDP FLTNLS E S E M FASELS 1 S LTEEQQQR IER 
NXQLALERRQAKLP 


6576 


1 


1060 


P E PQALVG OKRGALRLL VARI>VLTVSAPAE VRRRVLRP VLS fiRi) 
RET RALADSHFRGLG VDV PGVGQAPGR VAFVS E PGAFS YADFVR 
GFLLPNLPCVFSSAFTQGWGSRRRWVTPAGRPDFDHLLRrYGDV 
WPVANCGVQEYN SNPKEHMTLRD YITYWKE Y I QAG YS S PRGCL 
YLIO?WHLCRDFPVBDVFTLPVYFSSDWIjNEFWDALDVI)DYRFVY 
AGPAGSWS PFHADI FRS FS WSVNVCGRKKWLLFPPGQEEALRDR 
HGNLP YDVTS PALCDTHLHPRNQLAG P PLE ITQEAGEMVF V PSG 

WHHQVHNLVMCCFSCPLSGAFLQEDGSTTSPLSQPELGWNGUAH 
G 


6577 


2271 


987 


SDRMASDbPDWlEAMLEAPYKKEEDEWRKEVKKDYPSNTTSS 
TSNSGNBTSGSSTIGETSNRSRBRDRYRRRNSRSRSPGRQCRHR 
SRSWDRRHGSESRSRDHRREDRVHYRSPPLATGYRYGHSKSPHF 
REKS P VRE P VDNLS PEERDARTVFCMQLAAR I RPRDLEDFFS AV 
GKVRDVRI 1SDRNSRRSKGIAYVEFCEIQS VPLAIGLTGQRIjLG 
VP 1 1 VQAS QAEKNRIiAAMANNLQKGNGGPMRL YVGS LHFNI TED 
rMLRGI rePFGKIDNIVLMKDSDTGRSKGYGFITFSDSECARRAL 
EQLNGFEIiAGRPMRVGHVTERLDGGTDITFPDGDQELDLGSAGG 
RFQLMAKLAEGAGIQLPSTAAAAAAAAAAQAAALQLNGAVPIjQA 
LNP AALTALS PALNLAS Q CLQ LSS LFTPQTM 


6578


377 


1489 


PSSSATfdNRAPIjKRATILHMALTGASDPSAEAEANGEKPFIiLRA 
LQ IALWS LYWVTS I SMVFLNKYLLDS PSLRLDT P I FVTFYQCL 
VTTLLCKGLSALAACCPGAVDFPSLRLDLRVARSVLPLSWFIG 
MITFNNLCLKYVG VAFYNVGRS LTTVFNVLI^ YLLLKG/TTS FYA 
LLTCGI I IGGFWLGVDQEGAEGTLSWLGTVFGVLASLCVSLNAI 
YTTKVLPAVDGS IVTRliTFYNNVNACILFLPLLLLLGELQALRDF 
AQLGSAHFWGMMTLGGLFGFAIGYVTGLQIKFTS PLTHNVSGTA 

KACAQTVLAVLYYEETKSFIjWWTSNMMVIjGGSSAYTWVRGWEMK 
KTPEEPS PKDSEKSAMGV 


6579 


2 


711 


RPPRVWYPELRKLSAAAPRWSHRTAPGIMVFYFTSSSVNSSAYT 

IYMGKDKYENEDLI KHGWPEDIWFHVDKLSSAHVYLRLHKGENI 

EDIPKEVTiMDQiWT.VTriXKic Trv^Pi/Tunvnvn nvnnrvrpnT.mik'fT inrm« . 

» jjj iLj\ mm nsi±j ¥ cu-iri^> xytaUitMXvw V si V V x 1 P WSNIjKKTADM 

DVGQIGFHRQKDVKIVTVBKiCVNEILNRLEKTKVERFPDLAAEK 
ECRDREERNEKKAQ IQEMKKREKEEMKKKREMDELRS YS S LMKV 
ENMSSNQDGNDSDEFM 


6580


62 


1571 


LVALKNWKP KSiTNI PAPQSPVFGEAVSGVYMMTKVLGMAPVLGp 
RPPQEQVGPLMVKVEEKEEKGKYLPSLEMFRQRFRQFGYHDTPG 
PRE ALSQLR VLCCE WLRPE I HTKEQILE LIjVLEQFLTIL PQELQ 

AWVQEHCPESAEEAVTLLEDLERELDEPGHQVSTPPNEQKPVWE 
KrSSSGTAKESPSSMQPQPLETSHKYESWGPLYIQESGEEQEFA 
QDPRKVRDCRLSTQHEESADEQKGSEAEGLKGDIISVIIANKPE 
AS LERQCVNLENEKGTKPPLQEAGS KKGRES VPTKPTPGERRY I 








CAECGKAPSNSSNLTKHRRTHTGEKPYVCTKCGKAPSHSSNLTL 
HYRTHLVDRP YDCKCG KAFGQS S DLLKHQRMHTEEAP YQC KD CG 
KA?SGKGSLIRHYRIHTGEKPYQCNECGICSFSQHAGLSSHQRLH 
TGEKPYKCKECGKAPNHSSNFNKHHRIHTGEKPYWCHHCGKTFC 
SKSNLSKHQRVHTGEGBAP 


6581 


228


476 


RVFLKDLSSTPMASNNTASIAQARKLVEQLKMEANIDRtKVSKA 
AAD LMAYCEAHAKE D P LLTP VPAS BNPFREKKFFCAIIi 


6582


1428 


718


CFTTKTHCS PVSVP YLS PLVLRKEtiESLLENEGDOVIHTSSF IN' 
QHPIIFWTLVWYFRRLDLPSNLPGLILTSEHCNEGVQLPLSSLfl 
QDSKLVYIQLLWDNINLHQEPREPLYVSWRNFNSEKKSSLLSBE 
QQETSTLVETIRQS IQHNNVLKPINLLSQQMKPGMKRQRS LYRE 
I LPLSLVSliGRENI DI EAFDNE YG I AYNSIiS SEILERLQKXDAP 
PSASVEWCRKCFGAPLX 


6583 


487 


41 


RIFS^SGRIiRWRCTWRPATALWSASLRLGTSSMHPSPRSISL'P 
LSMMLSPLPSNTRGLSPTAIiFRSPDSEHATSCPRLHLWRCRAPI* 
RSPS PLGRLQVLPRS PLHVHTHNSGKEVLGLQVQRSRSGTGPAC 
SQAGSGAVQGGNWCIF I 


6584


189 


1750 


PLPMAAIX3PSSQNVTEYVVRVPKNTTKXYNIMAF^AADKVNFA1^ 
WNQARLERDLSNKKIYQEEEMPESGAGSEFNRKLREBARRKKYG 
IVLKEFRPEDQPWLLRVNGKSGRKFKGIKKGGVTENTSYYIFTQ 
CPDGAFEAFP VHNWYNFTP LARHRTLTAEEAEEEWERRNKVLNH 
FS I MQQRRL ECDQDQDE DEE EKE KRGRRKAS ElaRI HDLE DD LEMS 
SDASDASGEEGGRVPKAKKKAPLAKGGRKKKKKKGSDDEAFEDS 
DDGDFEGQEVDYMSDGSSSSQEEPESKAKAPQQEE3PKGVDEQS 
DSSEESEEEKPPEEDKEEEEEKKAPTPQEKKRRKDSSEESDSSE 
ES D I DSEASS AF FMAKKKT PP KRBRKPSGGSSRGNSR PGTPS AB 
GGSTSSTLRAAASKLBQGKRVSEMPAAKRLRLDTGPQSLSGKST 
PQP PSGECTTPNSGDVQVTEDAVRRYLTR1CPMTTKDLLKKFQTKK 
TGLSSEQTVNVLAQILKRLNPERKMINDKKHFSLKE 


6585 


3


1678 


GP IRNSR IDDF VGGDPRAEAS CS VLHS KPHAMADSRDPASDQMQ 
HWISQRAAQKADVI »TTGAGNPVGDKLNVITV§PRG PLLVQDVVF 
TDEMAHFDRBRI P E R WHAKG AGA FG Y FEVTHD I TKYS KAKVFE 
HZGKZTPIAVRFSWAGESGSADTVRDPRGFAVKFYTEDGNWDL 
VGNNTPIFFIRDPILFPSFIHSQKRNPQTHLKDPDMVWDPWSLR 
P ES LHQVS FLF3DRGI PDGHRHMNGYG SHTFKL VNANGEAVYCK 
FHYKTDQGIKNLSVEDAARLSQEDPDYGIRDLFNAIATGKYPSW 
TFYIQVMTFKQAETFP FNPFDLTKVWPHKD YPL I PVGKL VLNRN 
PVNYFAEVEQIAFDPSNMPPGXKASPDKMLQGRLFAYPDTHRHR 
LGPN YLHI PVNCP YRAR VANYQRDGPMCWQDNQGGAPNYYPNS F 
GAPEQQPSALEHS IQYSGEVRRFNTANDDNVTQVRAFYVNVLNE 
EQRiCRLCENIAGRXKDAQI F I QKKAVKNFTEVHPD YGSHIQALL 
DKYNAEKPKNAIHTFVQSGSHLAAREKANL 


6586 


32 


804 


PLPEQPASSTSTMPVSGTPAPNKKRKSSKLIMELTGGGQESSGL " 
KLGKKI S VPR DVMLEELS LLTNRGS KMFKLRQMRVEKFI YENH P 

SGSAGQYGS DQQHHLGSGSGAGGTGGPAGQAGRGGAAGTAG VGB 
TGStsDQAGGEGKHITVFKTYISPWERAMGVDPQQKMBLGIDLLA 
YGAKAELP KYKS FNRTAMP YGGYEKASKRMTFQMPKV 


6587


75


1117 


rrvpslgkmpecwdgehdietpygllhwirgspkgnrpailty" 

hdvglnhklcfntffnfedmqeitkhfvvchvdapg<x3vgasqf 

pqgyqfpsmeqlaamlpswqhfgfkyvigigvgagayvlakfa 

lifpdlveglvlwidpngbcgwidwaatklsgltstlpdtvlsh 

lfsqeelvnntelvqsyrqqignvvnqanlqlfwnmynsrrdld 

inrpgtvpnaktlrcpvmlwgdnapaedgvvecnskldptttt 

flkmadsgglpqvtqpgklteafkyl^l^mgympsasmtrlars 

rtasltsassvdgsrpqacthsbssbglgqvnhtmevsc 


6588 


137


501 


LG LQ AQ LL E LiRTNN YQLS DELR KNG VE LTS LRQ KVA YL DKE FS K 
AQKAIjSKSKKAQEVEVLLSENEMLQAKIiHSQEEDFRLQNSTLMA 
BFSKLCSQMEQIjEQENQQLKEGAAGAGVAQAGP J 


6589 


2 


1405 


RPWGSAMATFSRQEFFQQ^QddLLPTAO^GliDQIwLLl^ICIiA 
CRIjLWRLGLPSYLKHASTVAGGFFSLYHFPQLHMVWVVLIiSLLC 
Y L VL FLCRHS S HRG V FL, S VT I L I YLLMG EJWMVDTVT WHKM RGA 
QMIVAMKAVSLGFDLDRGBVGTVPSPVEFMGYLYFVGTIVFGPW 
ISFHSYlXJAVO^PIiSCRWLQKVARSriAlALLCLVLSTCVGPYL 
FPYFIPLNGDRLLRNKKRKARGTMVRWLRAYESAVSFHFSNYFV 
GFIiSEATATlAGAGFTEEKDHLEWDLTVSKPLNVBLPRSMVEVV 
TSWNLPMSYWLNNYVFKNALRLGTFSAVI,VTYAASAI»LHGFSFH 
LAAVLLS LAF I TYVEHVLRKRLAR ILSACVL S KRCPP DCS HQHR 
LGLGVRALNLLFGALAI FHLAYLGSLFDVDVDDTTBEQGYGMAY 
TVHKWSELSWASHWVTFGCWIFYRLIG 


6590 


11 

*x / / 


656 


VRAYEHVIi S LLENV FT P M F CHRD E Y FRQLLRGABS PTRNS KLNR 
GSLSLDDFRNTQKRGES FGISR IGSKI KGVFKSTTMEGAMIjPN Y 
GVAEGEDDFIEEGI WMEDDS PVEAVSTPNTPRNLAAWKI S IPY 
VDFFEDPSSERXEKKERIPVFCIDVERNDRRAVGHEPEHWSVYR 1 
RYLEFYVLESKLTEFHGAFPDAQLPSKRIIGPKNYEFLKSKREE 
FQEYLQKIXQHPEl^NSQIiLAOFLSPNGGETQFLDKILPDVNIiG 
KI IKSVPGKLMJCEKGQHLEPFIMNFINSCESPKPKPSRPELTIL 
S r^SE^mKKLF^TDLFKNNA^mAENTERKQNQNYFME VMTVE<^VY 
DYLMYVG^WF^VPDWI^LLMGTRILFKNTlfiMYTDYYLQCKL 
EQLFQEHRLVSLITLLRDAIFCENTEPRSLQDKQKGAKQTFEEM 
MNYIPDLLVKCIGEETKYESIRLLFDGLQQPVLNKOLTYVLLDI 
VIQELFPELNKVQKEVTS VTSWM | 


6591 


O T 11 
&X f I 


656 


vrayehvlsklsnvptpmfchrdeyfrqLlrgaesptrnski^rH 

GSLSLDDFRNTQKRGESFGISRIGSKIKGVFKSTTMEGAMIiPNY 
GVAEGEDDFIEEGI VVMEDDS PVEAVSTPNTPRNLAAWKI S IPY 
VDFFEDPSSERKEKKBRI PVFCIDVERNDRRAVGHEPHHWSVYR 
RYLEFYVLESKLTEFHGAFPDAQLPSKRIIGPKNYEFLKSKREE 
FQBYIiQKLLQHPBLSNSQLLADFLSPNGGBTQFLDKILPDVNLG 
KI I KS VPGKLMKEKGQHLE PFIMN F INSCE S P KPKPSRPBLTIL 
SPTSENNfCKLFITOLFKNNANRABNTERiCQNQNYFMEVMTVEGVY 
DYLM YVG RWF Q VP DWLHHIiLMGT R I LFKNTLEMYTD Y YLQ C KL 
EQLFQEHRLVSLI TLLRDAI FCEN TEP RSLQDKQKGAKQTFEEM 
MNYIPDLLVKCIGEETKYESIRLLFDGLQQPVLNKQLTYVLLDI 
VIQEL FPELNKVQKEVTS VTSWM j 


6592 


3 


1661 


APEFIiGSTISSGSMIDANLKI.LQKAEQRLKAIVAEiCFAIATKEG 
DLPQVBRFFKIFPLLGI^EEGLRKFSEYLCKQVAS5CAEENLLMV 
LGTDMS DRRAAVI FADTLTLLFEG IAR IVETHQP I VET YYG PGR 
L YTL I K YLQVECDRQVE KWDKF I KQRDYHQQFRHVQNNLMRNS 1 
TrEKIEPRBLDPILTEVTLMNARSELYLRFLKKRISSDPEVGDS 
MASEEVKQEHQKCLDKLLNNCLLSCTMQELIGLYVrMEBYFMRE 
T VNKAVALDTYEKGQLTS S M VDD VFY I VKKCIGRALS SS S I DCL 
CAMINLATTELESDFRJDVLCNKLRMGF PATTFQDIQRGVTSAVN 
IMHSSLQO^KFDTKGIESTDEAKMSFLVTLNNVEVCSENISTLK 
KTLES DCTKLFS QG I GGEQAQAKFD S CLSDLAAVS NKFRDLLQE 
GLTELNSTAIKPQVQPWINSFFSVSHNIEEEEFNDYEANDPWVQ 
QFILNLEQQMAE FKASLS PVI YDSLTGLMTSLVAVELEKVVLKS 
TFNRI^GI^FDKELRSLIAYLTTVTTWTIRDKFARLSQMATILN 

LERVTEILDYWGPNSGPLTWRbTTAEVRQVLALRIDFRSEDIKR 
LRL 1 


6593 


3 


1837 


EAFSAGSRRRGLALQRGVLGGLGGYCPCCCRRRGRLLVLLLLVR "I 
RGGEGGGGRGRGDKRRRRQARRQRRRPE PAEARGGKMADVLSVL 
RQYNIQKKEI\/VKGDEVIFGEFSWPKNVKTNYVVWGTGKEGQPR 








EYVTLDSILPIJiNNVHLSHPVYVRRAATENIPVVRRPDRKDLliG 
YLNGE AS TS AS I DRS APLE IGLQRSTQVKRAADEVLAEAKKPR I 
EDEECVRLDKERIAARLEGHKEGIVQTEQIRSLSEAMSVEKIAA 
IKAKIMAKKRSTIKTDLDDDITALKQRSFVDAEVDVTRDIVSRE 
RVWRTRTTI LQSTGKNFSKNI FAI LQS VKAREEGRAPBQRPAPN 
AAPVDPTLRTKQPI PAAYNRYDQERFKGKEBTEGPKIDTMGTYH 
GMTLKS VTEGASARKTQTPAAQP VPR P VSQARPP PNQKKGSRTP 
III I PAATTSLI TMLNAKDLLQDL KFVPS DEKKKQG CQRENETL 
I QRRKDQMQ PGGTAI S VTVPYRWDQPLKLMPQDWDRVVAVFVQ 
GPAWQFKGWPWLLPDGS PVDIFAKIKAFHLKYDEVRLDPNVQKW 
DVTVLELSYHKRHLDRPVFLRVWETLDRYMVKHKSHLRF | 


6594 


1 


1096 


EFPGRRFRGSQASPlidATCtoPALtiRAPTRAAMTRSLFKGNPWSA 
D I LSTIG YDN I IQHLNNGRKNCKE FED FLKERAAI EER YGKDLL 
NLSRKKPCGQSEINTLKRALEVFKQQVDNVAQCHIQIAQSLREE 
ARKMEEFREKQKLQRKKTELIMDAIHKQKSLQF1CKTMDAKKNYE 
QKCRDKDBAEQAVSRSANLVNPKQQEKLFVKIiATSKTAVEDSDK 
AYMLHIGTLDKVREEWQSEHI KACEAFEAQECERINFFRNALWL 
HVNQLSQQCVTSDEMYEQVRKSLEMCSIQRDIEYFVNQRKTGQI 
PPAP IMYBNFYS SQKNAVPAGKATG PNLARRGP LP I PKS S PDDP 
' NYSLVDDYSLLYQ 


6595 


57 


781 


PLGTMSDSDIjGEDEGLLSLAGKRKRRGNLPKESVKIIiRDWLiYIiH 
RYNAYPSEQE KLSLS GQTNLS VLQ I CNW FINARRRLL PDMLRKD 
GKDPNQFT I SRRGGKAS DVALPRGSS PSVLAVS VPAPTNVLSLS 
VCSMPLHSGQGEKPAAPFPRGELESPKPLVTPGSTLTLLTRAEA 
GS PTGGLFNTPPPTPPEQDKEDFSSFQLLVEVALQRAAEMELQK 
QQDPSLPLLHTP I PLVS ENPQ 


6596 


2 


1026 


PRLPVRRYHGRRRLGGRSRGHMAEGDAGSDQRQNEEiEAMAAIY 
GEEWCVIDDCAKIFCIRISDDIDDPKWTLCLQVMLPNEYPGTAP 
P I YQLMAPWIiKGQERADLSNSLEEI YIQNIGES ILYLWVEKIRD 
VL IQKSQMT EPGPDVKKKTEEEDVECEDDI; I LACQPES S VKALD 
FD ISETRTEVE VEELPP IDHGIP ITDRR5TFQAHLAPWCPKQV 
KMVLS KLYENKK IASAT HN I YAYR I YCEDKQT FLQDCEDDGETA 
AGGRLLHLMEILNVKNVWVWSR^GGIL^ 
ILVEKNYTNSPEESSKALGKNKKVRKDKKRNEH 


6597 


2 


1026 


prlpvrryhgrrrlqgrsrghmaegdagsdqrqneeieamaaiy 
geewcviddcakifcirisddiddpkwtlclqvmlpnbypgtap 

PIYQLNAPWLKGQERADLSNSLEEIYIQNIGESILYLWVEKIRD 
VLIQKSQMTEPGPDVKKKTEEEDVECEDDLILACQPESSVKALD 
FDISETRTEVEVEELPP IDHGI PITDRRSTFQAHLAPVVCPKQV 
KMVLSKLYENKKIASATHNIYAYRIYCEDKQTFLQDCEDDGETA 
AGGRLLHLME I LNVKNVMVWS RWYGGI LLGPDRFXHINNCARN 
ILVEKNYTNSPEESS KALGKNKKVRKDKKRNEH 


6598 


1099 


419 


PRVRWATTMAMS FEW PWQYRFPPFFTLQPNVDTRQKQLAAWCSL 
VLSFCRLHKQSSMTVMEAQESPLFNNVKLQRKLPVESIQIVLEE 
LrRKKGNliE WJjDKS KS S FL I MwRRPEEWG KL I YQWVSRSGQNNS V 
FTLYELTNGEDTEDEEFHGLDEATLLRALQALQQEHKAEIITVS 
DGPRRQVLLAGTCLPLLLTSHLSRAFKRRQTQCPPKTGSVTPPD 
SKGLQS 


6599 


164 


1593 


KMAALTTLFKYIDENQDRYI KKLAKWVAIQS VSAWPEKRGE IRR 
MMEVAAADVKQLGGSVELVDIGKQKLPDGSEIPLPPIIiLGRLGS 
DP QKKT VC I YGHLDVQP AAL EDGWDS EP FTLVERDGKLHGRGS 7 
DDKGPVAGWINALEAYQKTGQEI PVNVRFOiEGMEESGSEGIiDE 
LIFARKDTFFKDVDYVCISDNYt^LGKKKPCITYGLRGICYFFIE 
VECSNKDLHSGVYGGSVHEAMTDLILLMGSLVDKRGWILIPGIN 
E A VAAVTE EEHKLYDDIDFD IEEFAKDVGAQ I LLHS HKKD ILMH 
R WR YPS LSLHG IEGAFSGSGAKTV I PRKWGKFS I RLVPNMTP E 








WGEQVTS YLT KKPAEL RS PNB FKVYMGHGGKP W VSD FS H PH YL 
AGRRAMKTVFGVEPDLTREGGS I PVTLTFQEATGKNVML LP VGS 
ADDGAHSQNEKLNRYWY IEGTKMLAAYLYBVSQLKD 


6600 


2 


934 


PGRL FRVAAMES AGLEQ LLRELLL PDTER I RRATEQ LQI VIiRAP 
AALSALCDLLAS AADPQ I RQFAAVLTRRRLNTRWRRLAAEQRES 
LKS LILTALQRETEHCV S h SLAQLSATI FRKEG LEAWPQLLQLL 
QHSTHSPHSPEREMGLLLLSVWTSRPEAFQPHHRELLRIiLNET 
I^EVGSPGLLFYSLRTLTTMAPYLSTEDVPLARMLVPKLIMAMQ 
TLI PIDEAKACEAIiEALDELLESEVPVITPYLSEVLTFCLEVAR 
NVALGNAIR IRIIiCCLTFLVKVKS KALLKNRLLATLAAHPPPHC 
GC 


6601 


529 


1420 


PRAAARAPPPAVLRRDRRAATAPGAGEMTLHGPLAQRYFLNHIE 
KI TTWQDPRKAMNQP LNHMNLH PAVS STPVPQRSMAVS Q PNLVM 
NHQHQQQMAPSTLSQQNHPTQNPPAGLMSKPNALTTQQQQQOKL 
RLQRIQMERERIRMRQEELMRQEAALCRQLPMBAETIAPVQAAV 
NPPTMTPDMRSITNNSSDPFLNGGPYHSREQSTDSGLGLGCYSV 
PTTPEDFLSNVDEMDTGENAGQ-rPMNINPQQTRFPDFLDCIiPGT 
NVDLGTLBSEDLI PLFNDVESALNKSE PFLTWL 


6602 


127 


617 


IiLD FPALPKFVIiAQS PKAGKPSTM TS MTQSLRE VI KAKTKARNF 
ERVLGKITLVSAAPGKVICEMKVEBEHTNAI GTLHGGLTATLVD 
NI STO1ALLCTBR3APGVSVDMNITYMSPAKLGEDIVITAHVLICQ 
GKTLAFTS VDIiTNKATGKli I AQGRHTKHLGN 


6603 


79 


660 


PVGPSSLAARTGLGHLPFlJiRIiASSRGI^^LUJFLAFLFVLLL 
SGMGATGTLRTSLDPSLEIYKKMFBVKRREQLIJVLXOTjAQLNDI 
HQQYKIIJDVMLKGLFKVLEDSRTVL7AADVLPDGPFPQDEKLKD 
AFSHWENTAPPGDWLRFPRI VHYYFDHNSNWNLLIRWGI S FC 
NQTGVFNQGPHS PILSLM 


6604 


3 


688 


TSTAQRQGGERMS FRGGGRGGFNRGGGGGGFNRGGS SNHFRGGG 
GGGGGGNFRGGGRGGFGRGGGRGGFNKGQDQGPPERWLLGEFL 
HPCEDDIVCKCTTDENKVPYFNAPVYLENKEQIGKVDEI FGQLR 
DFYFSVKESENMKASSFKKLQKFYIDPYKLLPLQRFLPRPPGEK 
GPPRGGGRGGRGGGRGGGGRGGGRGGGFRGGRGGGGGGFRGGRG 
GGFRGRGH 


6605 


7 


848 


sgsrrgamraagvgi.vix:hchlsapdfdrdlddvlekaxkanvv 
alvavaehsgefekimqlserytrafvlpciiavhpvqglppedqr 
s vtlkdldvalpi ienykdrllaigevgldfsprfagtgeqkee 
qrqvlirqiolakrlnlpvnvhsrsagrptiniilqeqg^kvlii 

HAFDGRPS VAMEGVRAGYFFS IP PS 11 RSGQQKLVKQLPLTS IC 
LETDS PALGPEKQVRNEP WNI S I S AB YI AQVKGI S VEE VIE VTT 
QNALKLFP KLRHLLQK 


6606 


i. 


1682 


FVEXRPRAEVANLSAHSASPIQDAVLKRLSLLEBIVYRQLNGLS 
KSLGL IEG YGGRGKGGLPATLS PAEEE KAKG PHEKYGYNS YLS E 
KISIxDRS I PDYRPTKCKE LKYS KDLPQ ISIIFI FVNEALSVI LR 
SVHSAVNHTPTHLLKEIILVDDNSDBEELKVPLBEYVHKRYPGL 

VLSRI QENRKRVILPS IDNIKQDNFEVQRYENSAHGYSWELWCM 
YISPPKDWMDAGDPSIJ>IRTPAMIGCSFVVNRKFFGEIGIjLDPG 

mdvyggenxelgikvwlcggsmevlpcsrvahierkkkpynsni 
gfytkrnalrvaevwmddykshvyiawhlplenpgidigdvser 
ralrkslkcknpqwyldhvypemrrynntvaygelrnnkakdvc 
ldqgplenhtailypchgwgpql^ytkegflhlgalxtttiilp 
dtrclvdnsksrlpqlldcdkvkssly1crwnfi0ngaimnkgtg 
rclevenrglag idli lrsctgqrwti kns i k 


6607 


137 


986 


VPACAGLKKEARSLx^PPRLLNTKLQASCRALFSPPIQSRQTT 
GI S FQGRGG AGPG VPTRTQVFAAMG AVMGTFS S LQTKQRRPSKD 
KIEDEIiEMTMVTHRPEGLEQLEAQTNFTKRELQVLYRGFKNECP 








SG WNEDT FKQI YAQFFPHGDASTYAH^LFNAFDTTQTGS VKF"B 
DFVTALSI LLRGTVHEKLRWTFNLYD INKDG YINQEEMMD IVKA 
r!fDMI4GKYTYPVlJCBDTPRQHVDVFFQKMDKNKDGIVTLDEFLE 
SCQEDDNIMRSLQLFQNVM 


6608 


224 


1140 


RPCFSSPTGLCPRLSYPMILLQHAVLPPPKQPSPSPPMSVATRS 
TGTLQLPPQKPFGQEASLPIiAGEEBLSKGGEODCAIiEBLCKPLY 
CKLCWTI^SAQOAOAHYQGKNHGKKLRNYYAANSCPPPARMSN 
WEPAATPWP VPPQMGS FKPGGRVI LATEKDYCKLCDASFSSP 
AVAQAHYQGKNHAKRLRLAEAQSNS FSESSBLGQRRARKEGNEF 
KMMPNRRNMYTVQNNSGPYFNPRSRQRIPRDLAMCVTPSGQFYC 
S M CNVGAG EEM E FRQHLE S KQHKS KVS EQR YRNBMENLG YV 


6609 


1 


443 


FRLRCRRFRVAGGRLAGAGLRESRVPAPEQRLSALrijtSWSAVT 
PAAEPGNFQLS PAEPRGPLASPVRAAPRAPCPAAEMSBLNTKTS 
PATNQAAGQEEKGKAGNVKKAEEBEEIDIDLTAPETEKAAIiAIQ 
GKFRRFQKRKKDPSS 


6610 


319 


881 


GRKSLCNLHIPIRPPLTYPDMYMGMMCTAKKCGIRFQPPAIILI 
YESEI KGKIRQR IMPVRNFSKFSDCTRAAEQLKNNPRHKS YLEQ 
VSLRQLEKLFSFLRGYLSGQSLAETMEQIQRETTIDPEEDIiNKL 
DDKELAKRKS1MDELFEKNQKKKDDPNFVYD1EVEFPQDDQLQS 
CGWDTESADBF 


6611 


978 


212 


PGCSGAGSRWJWI>PALRHLAMGSTESSEGRRVSFGVDEEERVRV 
LQGVRLSENVVNRMKEPSSPPPAPTSSTFGLQDGNLRAPHKEST 
LPRSGS SGGQQ PS GMKEGVKRYEQEHAAIQDKLFQVAKREREAA 
TKHSKASLPTGEGSISHBEQKSVRLARELESREAELRRRDTFYK 
EQLBRX ERKNAEM YKLSS EQFHEAAS KME5 TI K PRRVEP VCSG h 
QAQ ILHCYRDRPHEVLLCSDIiVKAYQRCVSAAHKG 


6612 


1724 


992 



VSTHASALSRTQGQPQROPRAAASGAGAGTAGGGGSGGAEGSKM 
STEAQRVDDS PSTSGGSSDGDQRES VQQEPERE Q VQ PKKKEGK I 
SSKTAAKLSTSAKR1QKBLAEITLDPPPNCSAGPKGDNIYEWRS 
TII/5PPGSVYEGGVFFLDITFSPDYPFICPPKVTFRTRIYHCNiIN 
SQGVICLDILKDNWSFALTI S KVLLS I CS LLTDCNPADPLVGS I 
ATQYMTNRABHDRMARQWTKRYAT 


6613 


130 


748 


ELELSSNMPEQSNDYRVAVFGAGGVGKS SLVIjRFVKGTFRES YI 
PTVEDT YRQ VI S CDKS ICTLQ I TDTTGSHQ FPAMQRLS I S KGHA 
FIIiVYSITSRQSIiEEIiKPIYEQICEIKGDVSSIPIMLVGNKCDE 
S P S RB VQ S S EAEALAR'I*W KCAFMETS AKLNHNVKEL FQELLNLE 
KRRTVSLQIDGKKSKQQKRKEKLKGKCVIM 


6614 


3 


1191 


SSAAEAMRVLVRRCWGPPLAHGARRGRPSPQWRAIARLGWEDCR 
DSRVREKPPWRVLFFGTDQFAREALRALHAARENKEEELIDKLE 
WTMPSPS PKGL PVKQ YAVQS QL PVYE WPDVGSGE YDVGWAS F 
GRLLNEALILKFPYGIIJfVHPSCLPRWRGPAPVIHTVIjHGDTVT 
GVTIMQIRPKRFDVGPILKQETVPVPPKSTAKELBAVLSRLGAN 
MLISVLKNLPESLSWGRQQPMBGATYAPKISAGTSCIKWEEQTS 

PHT ZITfTMT TDT./'YPT' UMMVITT VT T nT \rc\rKtce\rr t\ r\r> vt m 

GQALIPGSVIYHKQSQILLVYCKIXSWIGVRSVMLKKSLTATDFY 
NGYLHPVryQKNSQAQPSQCRFQTLRLPTKKKQKKTVAMQQCIE 


6615 


832 


35 


GRVGAGASAMSELPGDVRAFLREHPSIiRLQTDARKVRCII/TGHE 
LPCRLPELQVYTRGKKYQRLVRAS PAFDYAEFEPHI VPSTKNPH 
QLFCKLTLRHINKCPEHVLRHTQGRRYQRALCKYBECQKQGVEY 
VPACLVHRRRRRBDQMDGDGPRPREAFWEPTSSDEGGAASDDSM 
TDLYPPEL FTRKDLGS TEDGDGTDD FLTDKBDEKAKP PREKATD 
EGRRETTVYRGLVQKRGKKQLGSLKKKFKSHHRKPKS FS SCKQS 
G 


6616 


347 


1886 


LLPPCQGARPLSSPPHASEDNLFLFWNCILCAFPHPSPQPLQYP 
VWPLLLVITQI PAPRHLRNRPFSFSRGGLDSFSGSLSTPSI CRS 








PAWVKMAPWPPKGLVPAVLWGLSLFLNLPGPiWLQPSPPPQSSP" 
PPQPHPCHTCRGLVDSFNKGLERTIRDNPGGGNTAWEEENLSKY 
KDSETIUjVEVLEGVCSKSDFBCHRLLBLSE ELVES WWFHKQQEA 
PDLFQWLCSDSLKLCCPAGTFGPSCLPCPGGTERPCGGYGQCEG 
EGTRGGSGHCDCQAGYGGEACGQCGLGYPEAERNASHLVCSACF 
GPCARCSGPEESNCliQCKKGWALHHLKCVD IDBCGTEGANCGAD 
QFCVNTEGSYECRDCAKACLGCMGAGPGRCKKCSPGYQQVGSKC 
LDVDECETEVCPGEN KQ CENTEGG YRCI CAEG YKQMEG I CVKEQ 
I PES AGFFSEMTBDELWLQQMFFGI 1 1 CA.LATLAAKGDLVFTA 
IFIGAVAAMTGYWLSERSDRVLEGFIKGR 


6617 


118 


673 


VWMAWQVSLLBLEDRLQCPICLEVFKESLMLQCX3HSYCKGCLVS 
IiSYHLDTKVRCPMCWQAVDGSSSLPNVaiAWVIEALRLPGDPEP 
KVCVHHRNPIiSLFCEKDOELICGIiCGTiLGQHnmrPVTPT ewrc 

RMKEELAALFSELKQEQKKVDELIAKLVKNRTRIDGSAPSLCPC 
LGPATFTFL 


6618 


548 


13* 


dgkvarrapnspafqndiyplvsaprAttaeSpwskvlqntqcr 

NVPKMTSERSRXPCLSAAAAEGTGiOCQQEQRAMATLDRKVPSPE 
AFLGKPWSSWIDAAKLHCSDNVDLEEAGKEGGKSREVMRLNKEA 
WKYGT 


6619 


246 


842 


PAS S E VLTAAVMFLLLiNCIVAVSONMG I GKNGDL PR P PLRJNE FR 
YFQRMTTTSSVEGKQNLVIMGRKTWFSIPKKNRPLKDRlNIiVLS 
RELKEPPQGAHFLARSLDnALKLTERPELANKVDMIWIVGSGSSV 
YKEAMMHI/SHLKLFVTRIMQDFESDTFFSEIDIjEKYKIjLPEYPG 
ILSDVQEGKHIKYKFEVCEKDD 


6620 


3 



1879 



NSRVDDFVARARMAAENEASQE3AU3AYSPVDYMSITSFPRLPE 
DE PAPAAP LRGRKDEDAFLGDPDTDPDS FLKS ARLQRLPS SSSR 
KGSQDGSPLRETRKDPFSAAAAECSCRQDGLTVIVTACLTFATG 

EAAVAAALCLGIVAPHSSGLGGGGVMLVHDIRRNESHLIDFRES 
APGALREETIiQRSWETKPGLLVGVPGMVKGLHEAHQLYGRI»PWS 
QVliAFAAA VAQIX3 FNVTHDLARALAEQL P PNMS ERFRE T FLP S G 
RPPLPGSLLHRPDIAEVLDVLGTSGPAAFYAGGNLTLEMVAEAQ 
HAGG VI TE E D FSNYS AL VEKP VCG VYRGHL VLS PPPPHTGPALI 
SAIiNILEGFNLTSLVSREQALHWVAETLKlALALASRLGDPVYD 
STITESMDDMXjSKVEAAYLRGH indsqaapapllpvyeldgapt 
AAQVXiIMGPDDFIVAMVSSIjNQPFGSGLITPSGILLNSQMLDFS 

wpnrtanhsapslensvqpgkrplsfllptvvrpaeglcgtyla 
lgangaarglsgltqvrftpwlaffsrepscgldcrclsylwlv 
siphaanmg 


6621 


1 


662 


vqgitsyqqrlqalrkeksrdaarsrrgkknfefyeiiakllplp 
aaitsqldkasiirltisylkmrdfanqgdppwnlrmegpppnt 
s vkvi gaqrrrs ps alai evfeahlgsh i lqsldgyvfalnqeg 
kfl yisetvs i ylglsqveiitgssvfdyvhpgdhvemaeqiigmk 

LPPGRGLLSQGTAEDGASSASSSSQSETPEPWCFPPASDQFLL 


6622 


2 


319 


GRASGAQBETEAGGPERARAMEANMPKRKEPGRSLRIKVISMGN 
AEVGKSCIIKRYCEKRFVSKYLATIGIDYGVTKVHVRDREIKVN 
IFDMAGHPFFYEVRKPF 


6623 


1886 


189 


KALFEKVKKFRLHVEEGDILYAMYVRQTVLKVIKFLI I IAYNSA" 
LVSKVQFTVDCNVDIQDMTGYKNFSCNHTMAHLFSKLS FCYLCF 
VS I YGLTCLYTLYWLF YRSLRBYS FEYVRQETGFDD2 PDVKNDF 
AFMLHMIDQYDPLYS KRFAVFLS E VS ENKLKQLNLNNE WT PDKL 
RQKI^NAHNRLELPLIMLSGLPDTVFEITELQSLKLEI IKNVM 
IPATIAQLDNIiQELSLHQCSVKIHSAALSFLKENLKVIiSVKFDD 
MRELPPWMYGLRNLEEIjYLVGSLSHDISRNVTLESLRDLKSIiKI 
hS I KSNVSKI PQAWDVS SHLQKMCIHNDGTKLVMLNNLKKMTN 
LTELELVHCDLERIPHAVFSLLSLQELDLKENNLKSIEEIVSFQ | 








HLRKIiTVLKLW HNS I T Y I PEHI KKLTSLERLS FSHNKI E VLPSH 
LFLCNKIRYLDLSYNDIRFIPPEIGVLQSLQYFSITCN1CVESLP 
DEIiYFCKKLKTLKIGKNSLSVLSPKIGNLLFLSYLDGKGNHFEI 
LPPELGDCRALKRAGL\nTCDALFETLPSDVREQMKTE 


6624 


218 




GSRRGGGSRIPAVSTHVAPGRS VLRPFASGAT.RTiR qt vie^m/in 
RGRPSGLAHLSQETSHWRAKRSGRACXGDFPGEIURSFIMKCTA 
RBWLRVTTVLFMARAI PAMWPNATIiLEKLLEKYMDEDGEWWIA 
KQRGKRAI TDNDMQS I LDLHNKLRSQVYPTASNME YMTWDVBLE 
RS AE S WAE S CLWEHG PAS LLPS IGQNLGAHWGRYRPPTFHVQSW 
YDEVKDFSYPYEHECM'PYCPPBpqf3P\7rTPrVTyw\7iJn»pcvmTr»r» 
AINLCHNMN IKGQ I WP KAVYLVCNYS PKGNWWGHAPYKHGRPCS 
AC P PS FGGG CR ENLCYKEGSDR YYPPREEETNE IERQQSQVHDT 
HVRTRSDDS S RNEVI S AQQMS Q IVS CE VRLRDQ CKGTT CNR YE C 
PAGCLDSKAKVIG5 VH YEMQSS I CRAAIHYGI IDNDGGWVDITR 
QGRKHY FI KSNRNG I QTIGKYQSANS FTVSKVTVQAVTCETTVE 
QLCPFHKPASHCPRVYCPRKLYASKSTLCSCNWNSSLF I 


6625 


1124 


543 


PGPRGGGGSLLSTKALdRSRGLGMHPGPSSGG^E^GVPTALRPP 
GPLVPSTSDDNLLKNI EL FDKLALRFHGRLLFLKDVLGDE I CCW 
S FYGQGRK I AEVCCTS I VYATEKKQTKVEFPEARI FEE TLNILI 
YETPRGPDPALLEATCGAAGAGGAGRGEDEENR EHRVRRI HVRR 
HITHDERPHGQQIVFKD 


6626 


3 


1498 


¥acv11 w tiuxiAa lav c*r i»v,5 JaitoDATMES I TACLKALQ AL 
IiDVPWPRSKIGSDQDSGIELLNVLHRVILTRESPSIQLASLEW 
RQI I CAAQEHVKEKRRS AEVDDGAAEKETLPEFGEGKDTGGLVP 
GKSLVFATLELCVCILVRQLPELNPKLTGSPGVKATKPQI LLED 
GSRLVSAALVI LS ELPAVCSPEGS I S ILPTIL YLTIGVLRETAV 
KLPGGQLSSTVAASLQALKG ILSS PMARAEKSRTAWTDLLRSAL 
TT1 LDCWDPVDETHOELDE VSLI/TA TTVPTTifi T Q DTTtTTf rvynr n 
KR C I DKFKATLE I KDP WQ I KTYQLLHS I FQ YPNPAVS Y P Y I YS 
LASCIMEKLQEIDKRKPENTAELEI FQBG I KVLETLVTVAEEHH 
RAQLVACbLPI LI SFLLDENSLGSATS IMRNLHDEALQNLMQIG 
PQYS S VFKSLVAS S PALKARLEAA I KGNQESVKVKIPTS K YTKS 
PGKNSSIQLKTSFL 


6627 


1 


697 


GIPHLSSRDMTGTPGAVATRDGBAPERSPPCSPSYDLTGKVMLL ' 
GDTGVGKTCFL I QFKDGAFLSGTFIA1VGID FRl^KVVTVDGVRV 
KLQ I WDTAGQER FRSVTHAYYRDAQALLLL YD I TNKSS FDNIRA 
WLTE IHEYAQRDWIMLI/3NKADMSSERVIRSEDGETLAREYGV 
PF1VBTSAKTGMNVEIAFLAIAKELKYRAGHQADEPSFQ1RDYVE 
SQKKRSSCCSFM 




1 


1861 


QCAEFGGGSGGGGGSGGGGSGGGRGAGGEENKENERPSAGSKAN 
KEFGDSLSLEILQIIKESQQQHGLRHGDFQRYRGYCSRRQRRLR 
KTLNFKMGl^KFTGKKVTEELLTO 

QLKQEANTEPRKRFHLLSRLRKAVKHAEELE RLCESNRVDAKTK 
LE AQAYTAYLSGMLRFEHQE W KAAI EAFNKCKT I YEKLASAFTE 
EQAVLYNQRVEEISPNIRYCAYNIGDQSAINELMQMRLRSGGTE 
GLLAEK1EALITQTRAK0AATMSEVEWRGRTVPVKIDKVRI FLL 
GLAllNEAAl^QAESEETKERLFESMLSECRDAIQVVREELKPDQ 
KQRDYII^GEPGKVSNLQYLHSYLTYIKLSTAIKRlOTMAKGIjQ 
RALLQQQPEDDSKRS PRPQDL I RL YD I ILQNLVE LLQLPGL E ED 
KAFQKE IGLKTLV FKAYRC F F I AQS YVLVKKWS EALVL YDR VLK 
YANEVNSDAGAFKNS LKDL P DVQBL ITQVRSEKCSLQAAAI LDA 
NDAHQTETSSSQVKDNKPLVERFETFCLDPSLVTKQANLVHFPP 
GFQPIPCKPLFFDLAIiNHVAFPPLEDKLEQKTKSGLTGYIKGIF 
GFRS 


6629 


5653 


4549 


GAT PLG S VGGRTG KMDAATLT YDTLR FAE FED FPETS E PVW I LG 
RKYSIFTEKDEILSDVASRLWFTYRKNFPAIGGTGPTSDTGWGC 








MLRCX3QMIPAQALVCRHLQRDWRWTQRKRQPDSYPSVLNAFIDR 
KDS YYS IHQIAQMGVGBGKS IGQWYG PNTVAQVL KKLAVFDTWS 
SLAVHIAMDNTWMEBI RRLCTTS VPCAGATAFPADSDRHCNGF 

DA^A17H'!%IUnCBbTDT)T 1FT T TBI TIT m Tin T liTTl « \n rrunr 

KnuiUSv llMKrb rritUrLiVljUl PIjKIAsIjTUINKAYvETLKHCFMMP 
QSLGVIG^KPNSAHYPIGYVCTBLIYLDPHTTQPAVEPTDGCFI 
PDES FHOQHP PCRMS IAEIjDPS I AWRGGHLSTQAPQAECCLGM 
TRKTFGFLRFPPSMLG 


6630 


2 


423 


LVQCX3GIRRRSAWGAMPGRHVSRVRALYKRVLQLHRVLPPDLKS 
LGDQYVKDEFRRHICTVGSDEAO^PLQl^IiVYATALLOQANBNRQ 
NSTGKACFGTPIiPBEKLNDPRDKQ IGQLQELMQ EATKPNRQPS I 
SBSMKPKP 


" 6631 


2 


423 


LVOCGGIRRRSAWGAMPGRHVSRVRALYKRVLQLHRViPi>DIiKS 
IX5DQYVKDEFRRHKTVG3DEAQRFLQEWEVYATALLQQANENRQ 
NS TG KACPGT PL? E EKLNDFRDEQ 1 GQLQELMQEATKPNRQFS I 
SESMKPKF 


" 6632 


1273 


568 


WNSRGRTQRGAAPLAPAAAM jkAWQRV^TRAS Vt^GG^QI S A£GR 
GIC^I/SISLEDTQKELEHMVRKILNLRVFEDESGKHWSKSVMD 
KQ YE I LCVSQFTLQ CVLKGNKPD FHLAMPTEQAEG FYNS FLEQL 
RKTYRPELIKDGKFGAYMQVH IQNDG P VTIELESPAPGTATSDP 
KQLSKLEKQQQRKEKTRAKGPSESSKERNTPRKEDRSASSGAEG 
DVSSEREP 


6633 


1144 


617 


ATGRHEGVPTLEGI IQQLVNGI ITPATI PSLGPWGVIjHSNPMD Y 
AWGANGLDAI ITQLLNQFENTGPPPADKEKIQALPTVPVTEERV 
GSGLE CP VCKDDYALGERVRQLPCNHLFHDG CI VPWLEQHDSCP 
VCRKSLTGQNTATNPPGLTGVSFSSSSSSSSSSSPSNENATSNS 


6634 


1 


1134 


CGGI PRKGSGPRRRLPMARLRDCLPRIjMLTLRSLLFWSLVYCYC 
GLCAS IHLLKLLWSLGKGPAQTFRRPAREHPPACLSDPSLGTHC 
YVRI KDSGIiRFHYVAAGBRGKPLMLIjIjHGFPEFWYSWRYQLRBF 
KSEYRVVALDLRGYGETDAPIHRQNYKIiDCLITDIKDILDS lgy 
S X CVL I GHDWGGM IAWL IA I CYFEMVMKL I VINF PHPNVFTEYI 
TiRHPAQDLKS S YYYFFQI PWFPEFM FS INDFKVLKfiLFTS HSTG 
IGRKGCQLTTEDLEAYI YVFSQPGALSGPINHYRNI FSCLPLKH 
HMVTT PTLLL WG END AFMEVEMAE VTR FYVKNY FRliTIL S EAS H 
WL»QQDQPDiVNiQjIWTFIiKEETRKKD 


6635 


1420 


470 


EMRAGQQLASMLRWTRAWRLPREGIjGPHGPSFAR VP VAP S S S SG " 
GRG GAEPRPLPLS YRIjIiDGEAALPAVVFLHGLFGS KTNFNS IAK 

ilaqqtgrrvltvdarnhgds PHSFDMSYEIMSQDLQDLLPQLG 
LVPCVVVGHSMGGKTAMLLAIjQRPELVERLI AVD IS PVESTGVS 
HFATYVAAMRAIWIADELPRSRARKLADEOIiSSVIQDMAVRQHL 
LTNLVEVDGRFWRVNLDAIiTQHLDKILAFPQRQESYLGPTLFL 
LGGNSQFVHPSHHPEIMRLFPRAQMQTVPNAGHWIHADRPQDFI 
AAIRQFXiV 


6636 


1514 


1801 


SFCMFSHKQDSHFQAVPVQEKKKRIiRRAPWRAFAQPQRLKHPAE 
QPIVRQCI^RPPLCXSVLGPVQ^QLPPSLGPVLSPHSDPGWCRVD 
DGGDGVF 


6637 


2 


1501 


CSSS PCFHDGTCVLDKAGS YKCACLAGYTGQRCENLLBAGKSKI 

KASEDSIiSVIiEERNCSDPGGPVNGYQKITGGPGLINGRflAKIGT 

VVSFFCNNSYVLSGNBKRTCQQNGEWSGKQPIClKACREPiaSD 

L VRRRVLPMQVQSRETPLiHQL YS AAFS KQ KLQSAPT KKPAL P FG 

DLPMGYQHLHTQIiQYECISPFYRRLGSSRRTCLRTGKWSGRAPS 

CIPICGKIENlTAPKTOJ3LRWPWQAAIYRRTSG\n^SLHKGAW 

PLVCSGALVNERTVVVAAHCVTDI^KVTMIKTADLKWLGKF 

DDDRDEKTIQSLQISAlILHPNYDPILLnADIAILKLLDKARIS 

TRVQPICIiAASRDIiSTSFQESHITVAGWNVLAnVRSPGFKNDTL 

RSGWSWDSLLCBEQHEDHGIPVSVTDNMFCASWEPTAPSDIC 

TAETGG IAAVS FPGRAS PEPRWHLMGLVSWS YDKTCSHRLSTAF 








TKVLPFKDW I ERNMK 


6638 


1391 


224 


GG I PQAGGKMAAPWWRAALCBCRRWRGFSTS AVLGRRTPPLGPM 
PNSDIDLSNLERLEKYRSFDRYRRRAEQBAQAPHWWRTYREYFG 
EKTD P KEK I D I G L PPPKVS RTQQLLERKQAI QELRANVEE ERAA 
RLRTASVPLDAVRAEWERTCGPYHKQRLAEYYGLYRDLFHGATF 
VPRVPLHVAYAVGEDDLMPVYCX3NEVTPTEAAQAPBVTYEAEEG 
SLWTLLLTS LDGH LLE PDAE YLHWLLTNI PGNRVAEGQVTCPYL 
PPFPARGSG IHRLAFLLFKQDQP I DFSEDARPSPCYQLAQRTFR 
TFT)PYKKHQETMTPAGLSFFQCRWDDSVTYIFHQLIjDMREPVFE 
FVRPPPYHPKQKRFPHRQPLRYLDRYRDSHEPTYGIY 


6639 


2046 


1266 


igc^imdggddgnli ikkrfvseaelderrkrrqeewekvrkpe 
dpeecpeevydprslybrlqeqkdrkqqeyeeqfkfknmvrgld 

EDETNFLDE VS RQQEL I EKQRREEELKELKBYRNNLKKVG I SQE 
NKKEVEKKLTVKP I ETTKNKFS QAKLLAGAVKHKS S ESGNS VKRL 
KPDPEPDDKNQEPSSCKSIiGNTSLSGPS IHCPSAAVCIG ILPGL 
GAYSGSSDS ESSS DS EGTINATGKI VSS I FRTNTFLEAP 


6640 


117 


1643 


vleppdVsmaesedrslrivlvgktgsgksatantilgeeifds 
riaaqavtkncqkasrewqgrdllvvdtpglfdtkesldttcke 

ISRCIISSCPGPHAIVLVLLLGRYTEEEQKTVALIKAVFGKSAM 
KHMVILFTRKEELEGQSFHDFIADADVGLKSIVKECGNRCCAFS 
NSKKTSKAEKESQVQEIiVELIEKMVQCNEGAYFSDDIYKDTEER 
LKQRE EVLRKI YTDQLNEE I KLVEEDKHKSEE KKE KEI KLIiKLK 
YDEKIKNIREEAERNIFKDVFNRIWKMLSEIWHRFIjSKCKPYSS 


6641 


1 


694 


1 SAAVGRRSBVRGCAPRPRLRRSAimMDPVPGTDSAP^iLAWSS 
ASAPPPRGFSAISCTVEGAPASFGKSFAQKSGYFLCLSSLGSLE 
NPQENVVADIQIVVDKSPLPI^FSPVCTOPMDSKASVSKKKKMCV 

kllplgatdtavfdvrlsgktktvpgylrigdmggfaiwckkak 

APRPVPKPRGLS RDMQGLS LDAASQP SKGGLLERTAS RLGSRAS 
TLRRNDS I YEASSLYG I SAMDGVP FTLHPR FEGKS CS PLAFS AF 
GDLTI KSLADIEEEYNYGFWEKrAAARLPPSVS 


6642 


22 


129* 


PLEERMMTkMDPNDQAQRDI IFELRRIAFDAESDPSNAPGSGTE 

KRKAMYTKDYKMLGFTNHINPAMDFTQTPPGMLALDN^ 

HQDTYIRIVLENSSREDKHECPFGRSAlEIiTKMLCEILQVGELP 

NEGRND YHPMFFTHDRAFEELFGI CI QLLNKTWKEMRATAEDFN 

KVMQ WREQ I TRALP SKPNS LDQ FKS KLRSLS YS E I LRLRQ9 ER 

MSQDDFQSPPIVELREKIQPEILELIKQQRLNRLCEGSSFRKIG 

NRRRQERFWYCRLALNHKVLHYGDLDDNPQGEVTFESIiQEKIPV 

ADIKAIVTGKDCPHMKEKSALKQNKEVLELAFSILYDPDETIJWF 

IAPNKYEYCIWIDGLSALLGKDMSSELTKSDLDTLLSMEMKLRL 

LDLENIQIPEAPPPIPKEPSSYDFVYHYG 


6643 


3049 


2265 


SLHAPAEGRTRGRLAEKPKMLTRKIKLWDINAHITCRLCSGYIjI 
DATTVTECIiHTFCRSCLVKYIiEENNTCPTCRIVIHQSHPLQYIG 
HDRTMQDI VYKLVPGLQEAEMRKQRE FYHKLGMEVPGDI KGETC 
SAKQHLDSHRNGETKADDSSNKEAAEEKPEEDNDYHRSDEQVSI 
CLE CNS S KLRG L KRKW IRCS AQ ATVLHLKKF I AKKLNLS S FNEL 
DILCNEEIl^KDHTLKFWVTRWRFKKAPIiLLHYRPKMDLL 


6644 


1489 


290 


FRPIxATEPRGSSPVQLVSSTMSVRTLPLLPLNLGGEMIiYILDQR 
LRAQN I PGDKARKVLND IIS TMFNRKFME ELFKPQE LYS KKALR 
TVYE RLAHAS I MKLNQ AS M D KL YDLM TMA FKY Q VLLCPRP KDVL 
LVTFNHLDTIKGFIRDSPTI LQQVDETLRQLTE I YGGLSAGEFQ 
LIRQTLIjI FFQDLHI RVSM FLKDKVQNNNGRFVLPVSG PVPWGT 
EVPGLIRMFNNKGEEVKRIEFKHGGNYVPAPKEGSFEFYGDRVL 
KLGTNMYSVNQPVETHVSGSSKNLASWTQES IAPNPLAKEELNF 
LARLMGGMEIKKPSGPEPGFRLNLFTTDEEEEQAALTRPEELSY 
E VI N I QATQDQQRSEELAR I MGE FE ITEQ PRLSTS KGDDLLAMM 
DEL 




6530 



4646 


FVEGLAGYVYKAASEGKVLTIAALLLNRSESDIRYLbGYVSQQG 
GQRSTPLI IAARNGHAKVVRLLLEHYRVQTQQTGTWFTXoYVID 
GATALW CAAGAGHPK VVKJjL VSHGANVNHTTVTNSTP LRAACFD 
GRLDIVKYLVENNANIS I ANKYDNTCLMIAAYKGHTDVVRYLLE 
QRADPNAKAHCGATALHPAAEAGHIDI VKELI KWRAAI WNGHG 
MTPLKVAAESCKADWELLLSHADCDRRSRIEALELLGASFAND 
REN YDI IKTYHYLYLAMLERFQDG DNI LE KE VLPPIHAYQNRTE 
CRN PQELES I RQDRDALHMEGLI VRER I LGADN I DVSHP XX YRG 
AVYADNME FEQCI KL WLHALHLRQKGNRNTHKDLLRFAQVFSQM 
IHLNETVKAPDIECVLRCS VLEI EQSMNRVKNI SDADVHNAMDN 
YECNLYTFLYLVCISTKTQCSEEDQCKINKQIYNLIHLDPRTRE 
GFTLLHLAVNSNTPVDDFHTNDVCS FPNALVTKLLLDCGAEVNA 
VDNEGNSALHX IVQYNRPISDFLTLHS II ISLVEAGAHTDMTNK 
QNKTPLDK5TTGVSEILLKTQMKMSLKCLAARAVRANDINYQDQ 
I PRTLEEFVGFH 


6646 


176 


890 


pssrmnhlpedmenaltgsqsshaslrnihsinptqlMaries'y 
egrekkgisdvrrtfclfvtfdllfvtllwiielnvnggientl 
ekevmqyd yys s yfdi fllavfr fk vli layavcrlrh w waial 
ttavts afllakvt ls klfsqg afg yvlp 1 1 s f i law iet wfld 
fkvlpqeaeeenrllivqdaseraalipgglsdgqfysppesea 
gseeaeekqdsekpllel 


6647 


176 


890 


PS SRMNHLPEDMENALTGSQSSHAS XjRNI KS I NP TQLMARI ES Y 
EGREKKGISDVRRTFCLFVTFDLLFVTLLWIIELNVNGGIENTL 
EKEVMQYD YYS S YFDI FLLAVFRFK VL I LAYAVCRLRH WW AIAL 
TTAVTSAFLIjAKVTLS KLFSQGAFG YVLP I IS F I LAW I ET W FLD 
FKVLPQEAEEENRLLIVQDASERAALIPGGLSDGQFYSPPESEA 
GSEEAEE KQDS EKPLLEL . 




413 


897 


RNCWNCFTKYFNSPPEDIDHKDS YLITRS IMAEPDY IEDDNPEL™" 
IRPQKLINPVKTSRNHQDZjHRELLMNQKRGLAPQIJKPBLQICVME 

krkrdqvi kqkeeeaqkkksdle ie llkrqqkleqlelekqklq 
eeqenapepvkvkgnlrrtgqevaqaqes 


6649 


1357 


832 


WIPRAAGIRHEVKWDVKEIMSQHNIYVDALLKEFEQFNRRLNEV 
SKRVRIPLPVSNILWEHCIRIjANRTIVEGYANVKKCSNEGRALM 
QLDFQQFTMKLEKLTDIRPIPDKEFVETYIKAYYLTENDMERWI 
KEHRE YSTKQLTOTiVNVCLGSHINKKARQKLLAAIDD IDR PKR 


6650 


32 


765 


LVPL VFS LLVQSCKQVYRS I AMKF VP CLLLVTLS CLGTLGQAPR 
QKQGSTGEEFHFQTGGRDSCTMRPSSLGQGAGEVWLRVDCRNTD 
QTYWCEYRGQPSMCQAFAADPKSYWNQALQELRRLHHACQGAPV 
LRPSVCREAGPQAHMQQVTSSLKGSPEPNQQPEAGTPSLRPKAT 
VXLTEATQI<GKDSMEELGKAia?TTRPTAKP7QPGPRPGGNEEAK 
KKAWEHCWKP FQALCAFLI S F FRG 


6651 


3425 


1353 


AKELLXVGDFSLCAGP YQNTADTMENLSKBPLAS FVSESFDISA 
CG I ATEHVKI DNS GEGLTAEAGS ETLS RDGEVGVNS DMHYE LSG 
DSDLDLLGDCRNPRLDLEDS YTLRGS YTRKKDVPTDGYESSLNF 
HNNNQEDWGCSSWVPGWETSLPPGHWTAAVKKEEKCVPPYVQIR 
UUivj X liK I YAW KS ITKSLKDTMRTSHGLRPJiPSFSANCGL PS SW 
TSTWQVADDLTQNTLDLEYLRFAHKLKQTIKNGDSQHSASSANV 
FPKESPTQISIGAFPSTKISEAPFLHPAPRSRSPLLVTWESDP 
RPQG Q PRRGYTAS SLDS S S S WRERCS HNRDLRNS QRNHTVS FHL 
NKLKYNS TVKESRND ISL I LNE YAEFNKVM KNSNQF IFQD KELN 
DVSGEATAQEMYLPFPGRSAS YEDI I IDVCTNLHVKLRS WKEA 
CKSTFLFYLVETEDKS FFVRTKNLLRKGGHTE I E PQHFCQA FHR 
ENDTLIIIIRNEDISSHLHQIPSLLKLKHFPSVIFAGVDSPGDV 
LDHTYQELFRAGGFVTSDDKI LEAVTLVQLKEI I KI LEKLNGNG 
RWKWLLHYRENKKLKEDERVDSTAHKKNIMLKS FQSANIIELLH 
YHQCDSRS STKAfi I L KCLLNLQ IQHI DARFAVLLTDKPT I PREV 








F ENNG I L VTD VNNFI EN I EKIAAP FRSS YW 


6652 


2 


1343 


ipgstiscschsrrlrggspaprLslgaa£prprppslplplpl 
pfplflptrpaerawirsrrasewvgkmevprldhalnsptspc 

EE VI KNL S LEA I QL CDRDGNKS Q DSG I AEM EELP VPHN I KI SN I 
TCDS FKI S WEMDS KSKDRITH YFI DLNKKENKNSNKFKHKDVPT 
KLVAKAVPLPMTVRGHWFLS PRTE YTVAVQTASKQVDGD YWS E 
WSE I IE FCTADYSKVHLTQLLE KAE VIAGRMLKFS VFYRNQHKE 
YFD YVREHHGNAMQPS VKDNSGSHGS P I SGKLEG I F FS CSTEFN 
TG KP PQDS P YGRYRFE IAAE KLFNPNTNLYFGDFYCMYT AYHYV 
I LVTAP VGS PGDEFCKQRLPQLNS KDNKFLTCTBEDGVLVYHHA 
QDVILEVIYTDPVDLSLGTVAEITGHQLMSLSTANAKKDPSCKT 
CNISVGR 


6653 



170 


1910 


FFLEPRLRPFPASRARFVPARTRPSPLHPCCFCFEGGGSMLSPQ 
RVAAAASRGADDAMES SKPGPVQWLVOKDOHS FELDEKALASI 
LLQDHIRDLDVVVVSVAGAFRKG KS FI LDFMLR YLYSQKESGHS 
NWLGDPEEPLTGFSWRGGSDPETTGIQIWSEVFTVEKPGGKKVA 
WLMDTQGAFDSQSTVKBCATIFALSTMTSSVQIYNLSQNIQED 
DLQQLQLFTEYGRLAMDEIFQKPFQTLMPIjVRDWSFPYEYSYGL 
QGGMAFLDKRLQ V KEHQHEE IQNVRNHIH3 CFSDVTCFLLPHPG 
LQVATS PDFDGKLKD IAGEFKEQLQAL I P YVLNPSKLME KE ING 
S KVTCRGLLB YFKAY I KI YQGEDLPHPKSMIjQATAEA YNLAAAA 
SAKDIYYNNMEEVCGGEKPYLSPDILEEKHCEFKQIiALDHFKKT 
KKMGGKDFSFRYQQBLEEEIKELYENFCKHNGSKNVFSTFRTPA 
VLFTGIVALYIASGLTGFIGLEWAQLFNCMVGIiLLIALLTWGY 
I R YS GQ YRELGG AID FG AAYVLE Q AS S H I GN S TQ ATVRDAWG R 
PSMDKKAQ 


6654 


1 


705 


RTSLSPSQCSS FNLAMASAGMQILGVVLTLLGWVNGLVSCALPM '" 
WKVTAFIGNS IWAQWWEGLWMSCWQSTGQMQCKVYDSLLAL 
PQDLQAARAL CVI ALLVALFGLLVYIiAGAKCTTCVE E KDS KARL 
VLTSGIVFVISGVLTLIPVCWTAHAVIRDPYNPLVAEAQKRELG 
ASLYLGWAASGLLLLGGGLLCCTCPSGGSQGPSHYMARYS tsap 
AISRGPSBYPTXNYV 


6655 


341 


16 


KDAYMFKKGLIALALVFSLPVFAAEHWIDVRVPEQYQQEHVQGA 
INI PLKEVKER I ATAVPDKNDTVKVYCNAGRQSGQAKE ILSEMG 
YTHVENAGGLKD I AMPKVKG 


6656 


2 


1212 


TELP PRPANLAI Q P PLS PLRALAPLPEKPGA VP. P PQ KRMAKVAKl " 
DLNPGVKKMSLGQLQSARGVACLGCKGTCSGFEPHSWRKICKSC 
KCSQEDHCLTSDLEDDRKIGRLLMDS KYSTLTARVKGGDG IRI Y 
KRNRMIMTNPIATGKDPTFDTITYEWAPPGVTQKLGLQYMELIP 
KEKQPVTGTEGAFY RIIRQLMHQLP I YDQDPSRCRGLLENELKLM 
EEFVKQYKSEALGVGEVALPGQGGLPKEEGKQQEKPEGAETTAA 
TTNGSLSDPSKEVEYVCELCKGAAPPDSPVVYSDRAGYNKQWHP 
TCFVCAKCSEPLVDLIYFWKDGAPWCGRHYCESLRPRCSGCDEI 

IFAEDYQRVEDLAWHRKHFVCEGCEQLLSGRAYIVTKGQLLCPT 
CSKSKRS 


6657 


pin 


2120 


LLTCQERAGDCLLSASTMKEWYWSPKKVADWLLENAMPBYCEP 
LEHFTGQDL INLTQEDFKKPPLCRVS SDNGQRLLDMIETLKMEH 
HLEAHKNGHANGHLNIGVDI PTPDGSFSIKIKPNGMPNGYRKEM 
IKIPMPELERSQYPMEWGKTFLAFLYALSCFVLTTVMISVVHER 
VPPKEVOP^PLPDTFFDHFNJIVOWAFSICEINGMILVGLWLIQWL 
LLKYKS I ISRRFFCI VGTLYLYRCITMYVTTLP VPGMHFNCS PK 
liFGDWEAQLRRIMKLIAGGGLS ITGSHNMCGDYL YSGHTVMLTL 
TYLFIKE YS PRRLWWYHWI CWLLS WG I FCILLAHDHYTVDVW 
AYYITTRIjFWW YHTMANQQVLKEASQMNLLARVWW YRP FQ YFE K 
NVQG I VPRfl YHW P FPWPWHLS RQVK YS RLVNDT 


6658 


35 


855 


HCCALGAPGSPYRGLYFSSAAPCTAPRKAKHQSTLEGLTKRMLM 








pdpvpvkqeamdpvsvsypsnymesmkpnkygvtystplpekff' 

QTPEGLSHGIQMBPVDLTVNKRSSPPSAGNSPSSLKFPSSHRRA 
SPGLSMPSSSPPIXKYSPPSPGVQPFGWPLSMPPVMAAALSRHQ 
I RS PGI LP VT Q P VWQPVPFM YTSHLQQ PLMVS LSEEMENS S SS 
MQVPVIBSYEKPISQKKIKIEPGIBPQRTDYYPEEMSPPLMNSV 
SPPQALLQE 


6659 


18 


523 


EPQRGDCETWPQNCSLPKFVCFFCWGFWLWRAHSMSNLHSLPGL 
RG LTS I S RNQLQ CTNAMRV I NNYQRR W KNQNTFLLAT FANWNV 
CGNPTITCPHNRTIiNNCHHSGVQVPLMYCNLTTPSPQNISNCRY 
AQTPANMFYIVACDNRDQRRDPPQYPVVPVHLHTI I 


6660 


514 


1707 


OIASLDCRHHLCEPDMKLWPSAKLLQAAAGASARACDSVTSNV 
LPLLLEQFHKHSQSSQRRTILEMLLGFLKLQQKWSYEDKDQRPL 
NGFKDQLCSLVFMALTDPSTQLQLVGIRTLTVLGAQPDLLSYED 
LELAVGHLYRLSFLKEDSQSCRVAAIjEASGTLAALYPVAFSSHL 
VPKLAEELRVGESNIjTNGDEPTOCSRHLCCLQALSAVSTHPSIV 
KETLPLLLQHLWQVNRGNMVAQS SDV I AVCQS LRQMAEKCQQD P 
ESCWYraQTAIPCLLAIiAVQASMPEKEPSVLRKVLLEDEVLAAM 
VS VIGTATTHIiSPELAAQSVTHIVPLFLDGNVSFLPENS FPSRF 
QPFQIX3SSGQRRLIALLMAFVCSLPRNVSEHIWEVLLFNIJDKVT 
PG 


6661 


179 


430 


GVHAASGTLSATWIAEAKMFD3IJUG\GKYLGQAAKLMIGM 
NYVEHMRVNHPDQTPMTYEEFFRERQDARYGGKGGARCC 


6662 


185 


423 


rslpkpapaqpasihcarfsgvtpptaktamsdg^tafnaLm^c - 

GPKADDGNI FSACAPASSAVKASVSVAQPGQAVIP 


6663 


3 


1005 



rpvlssrvddfvpplpetsgrrkklermysvdrvsddi PIRTWF 

PXENIiFS FQTASTTMQAISNFR KHLRMVGSRR VKAQTFAERRER 

sfsrswsdptpmkadtshdsrdssdlqsshctldeafedldwdt 
ekgleavacdtegfvppkvmliss kvpkaeyi pti irrddpsi i 
pilydhehatfkdiiieeierklnvyhkgakiwkmlifcqggpgh 

LYLLKNKVATFAKVEKEEDMIHFWKRLSRLMS KVNPEPNVIHIM 

GCYIIiGNPNGEKLFQNIiRTLMTPYRVTFESPLELSAQGKQMIET 
YFDFRL YRLWKS RQHS KLLDFDDVL 


6664 


56 


968 


PRLLRLPRS VWMDS PWDELALAFSRTSMFP F FDI AHYLVSVMA 
VKRQPGAAAIAWKNP I SS WFTAMLH CFGGG I LS CLLLAEP PLKF 
LANHTWlLIASSIWYITFFCPHDLVSQGYSYLPVOIiLASGMKEV 
TRTWKIVGGVTHANSYYKNGWIVMIAIGWARGAGGTI ITNFERXi 
VKGDWKPEGDEWLKMSYPAKVTLLGSVI FTFQHTQHLAISKHNL 
MFLYl'IFIVATKrTMMTTOTSTMTFAPFEDTI^WMLFGWQQPFS 
SCEKKSEAKSPSNGVGSLASKPVDVASDNVKKKHTKKNE 


6665 


171


1278 


DERRLACRQWTQQRSELYPGFQKRQRFLPKAGEEAAAQGGRHL 
PGRWLGPGCTQNPCSVHTATGPEPRKLPIiLPPDSPNSGYPKBPA 
AI^PGIPSPCRMTHQDLSITAKLINGGVAGLVGVTCVFPIDLAK 
TRLQNQHGKAMYKGMIDCLMKTARAEGFFGMYRGAAVNLTLVTP 
EKAI KLAANDFFRRLLMEDGMQRNIiKMEMLAG CGAGMCQ VVVTC 

PMEMLKIQLQDAGRLAVHHQGSASAPSTSRSYTTGSASTHRRPS 

BTLiT AWRTir,RTnf3T.ar2T.VD/^T j^JiTT t anmpo t rvnnT m « 
n J. u4/inciujn a w UAirtuJU X t\\JlA3Ml ±*LtH±Jltrvz} X X Y r PI/FANLNN 

LGFNELAGKAS FAHS FVSG CVAGS I AAVAVTPLD VLKTR IQTLK 
KGLGEDMYSGITDCAR 


6666


498 


2868 


MTTFLPVPQMMAGFSFGTFGNPPP1ESPSAWQTIHQPFIVSCLTL 
WS PGCWPQP IQKEG VGLWD I R KPQS S LLRYGGNLSLQSAMS VRF 
NSNGTQLLALRRRLPPVLYDIHSRLPVFQFDNQVYFNSCTM KSC 
CFAGDRDQYILSGSDDFNLYMWRI PADPEAGGIGRWNGAFMVL 
KGHRS IVNQVRFNPHTYMICSSGVEKII KIWSPYKQPGCTGDLD 
GRIBDDSRCLYTHEEYISLVIiNSGSGLSHDYANQSVQEDPRMMA 
FFDSLVRREIEGWSSDSDSDLSESTILQLHAGVSERSGYTDSES 
SASLPR5PPPTVDESADNAFHLGPLRVTTTNTVASTPPTPTCED 
AASRQQRLSAIilU^YQDKRLLALSNESDSEENVCEVELDTDLFPR 
PRSPSPEDESSSSSSSSSSEDBEELNERRASTWQRNAMRRRQKT 
TREDKPSAP I KPTNTY I GEDNYDYPQ I KVDDLSSSPTS S PERS T 
STLE IQPSRAS PTSD IES VER KI YKAYKWLR YS YI S YSNNKDGE 
TSLVTGEADEGRAGTSHKDNPAPSSSKEACLNIAMAQRNQDLPP 
EGCS KDTPKEETPRTPSNGPGHEHS SHAWAE VP EOT S QDTGNS G 
S VEHPFBTKKLNGKALSSRAEE PPS PP VPKASG STLN SGS GNCP 
RTQSDDSEERSLETICANHNNGRLHPRPPHPHNNGQNLGELEVV 
AYS S PGHS DTDRDNS SLTGTLLHKDCCGSBMACETPNAGTRKD P 
TDTPATDSSRAVHGHSGLKRQRIBZ»3DTDSENSSSKKKI>KT 


6667




1310 


ABEVBR1AAMRS DSLV PGTHTPP I RRRSKFANLGRi PKPW KWRK 
KKSEKFKHTSAALERKI S MRQSREELI KRG VLKE 1 YDKDGELS I 
SNEEDSLENGQSLSSSQLSLPALSEMBPVPMPRDPCSYEVLQPS 
DIMTCPDPGAPVKLPCLPVKLSPPLPPKKVMICMPVGGPDIiSIiV 
SYTAQKSGQQGVAQHHHTVLPSQIQHQbQYGSHGQHIiPSTTGSL 
PMHPSGCRMIDELl^IiAMTMQRIiESSEQRVPCSTSYHSSGLHS 
GDGVTKAGPMGLPEIRQVPTWIECDDNKENVPHBSDYEDSSCL 
YTREEEEEBBDEDDDSSIjYTSSLAMKVCRKDSlAIKPSNRPSKR 
ELEEKNI LPRQTDEERLELRQQIGTKL 


6668 


714 


358 


TLAVAT&PALTLRCHVCTS s snckhs wc pas srfckttntvep 

LRGNLVKKDCAE SCTPS YTLQGQVSSGTS STQCCQEDLCNEKLH 
NAAPTRTALAHSALSLGIALSLLAVILAPSIj


6669


459 


1207 


KDEBTRKDYDYMLDHPEBY YSHY YHYYSRRLAPKVDVRWI LVS 
VCA1SVPQFFSWWNSYNKAISYIATVPKYRIQATEIAKQQGLLK 
KAKEKGKNKKSKEEIRDBEENIIKNI IKSKID1 KGGYQKPQ ICD 
LLLFQI I LAP FHLCS YI WYCRWI YNPNI KGKEYGEEERL YI I R 
KSMKMSKSQFDSLEDHQKETFLKRELWIKENYEVYKQEQEEELK 
KKLANDPRWKRYRRWMKNEGPGRLTFVDD 


6670 


184 


594 


VAR I * GEAAKMSSEP PPPYPGGPTAPIiLEEKSGAPPTPGRSS PA 
VMQPPPGMPLPPADXGPPPYEPPGHPMPQPGFIPPHMSADGTYM
PPGFYPPPGPHP PMGYYP PGPYTPGPYPGPGGHTATVLVPSGAA 
TTVTV 


6671 




763 


LPAEKPRSAPNHAGGRCGPQLTALLiAAWIAAVAATAGPEEAALP 
PEQSRVQPMTASNWTI*VMEGE WMLKF YAP WCPS CQQTDSEWEAF 
AKNGEII^ISVGKVDVIQEPGLSGRFFVTTLPAFFHAKDGIFRR 
YRGPGIFEDLQNYILEKKWQSVEPLTGWKSPASIiTMSGMAGLFS 
ISGKIWHLHNYFTVTLGIPAWCSYVFFVIATLVFGLSMDLVL*V 
ISQCNWDPPYRHVS * /RPSTNLGVHTAHTSEHLRL 


6672


304 


1089 


APGSKP VQFMDFEGKTS FGMSVFN LSNAIMGSG ILGtAVAMAHT 

PAGKVVVAT\n:CLHNVGAMSSYLFIIKSELPLVIGTFLYMDPEG 
DWFLKGNLL1 1 IVSVLI ILPLALMKHLG YLGYTSGLSLTCMLFF 
LVSVIYKKFQLGLCYRATMKQQWESEALVGTPQPRDSTAAVKAQ 
MFH3*LTGVLTQWPI MAFAFVCHPGGAGPS ITELCRAFQAQD 


6673 


1116 


1963 


LQIQTHHTHHGARVTHLGSHQLIiANAGTMLCRQQSSSMAPAPSQ 
S VTCGP SPC VRKQES ATKCLHI GACGS DLWARGWEQG+ G * GLNV 
WLCPCVAFHRGARPQAEEGGARWWSLVSSPWIPPNP*HSSIGAE 
NAVPRP * QG * KVNPS GQERQS \ WVLPLPVPGEP LKLPGL PG *NK 
SFSRV/SGSKGKWILPRQLM*AS+R\TPRFVPGTQWVPITW/PI, 
ITWH*SAPTPPLKACPAPRE^DPCSSO*SCPCVTQKPRFSDTGW 
FG AGHCHS S CDFTRKGAAGGPG 


6674 


1 


440 


LEFDYMCQYDYVEVRDGDNRDGQI IKRVCGNERPAPIQS IGSSL 
HVL FHSDGS KWFDG FHAI YEE 1 TAGS S S PC FHDGTC VLDKAGS Y 
KCACLAGYTGQRCENLLEERNCSDPG/WPSQWVPENNRGPWAYQ 
PTPC* IGTRVAFFLT 


6675 


277 


1678


GNWPTERMAFLDNPT I ILAHIRQ5 HVTSDDTGMCEMVL I DHDVD 
LEKIHPPSMPGDSGSEIQGSNGETQGYVYAQSVDITSSWDFGIR 
R RSNTAQRLERLRKE RQNQ I KCKNIQ WKERNS KQS AQELJCSLFE 
KKSLKEKPPISGKQS ILSVRLEQCPIiQLNNPPNEYSKFDGKGHV 
GTTATKKIDVYLPLHS S QDRIXPM TVVTMASARVQDL I GL ICWQ 
YTSEGREPKLKDNVSAYCLHIAEDDGEVDTDFPPLDSNEP IHKF 
GFSTLALVEKYS SPGLTSKESLFVR INAAHG FSL IQVDNTKVTM 
KEILLKAVKRRKGSQKVSGSRADGVFEEDSQIDIATVQDMLSSH 
HYKS FKVSM1 HllLRFTTDVQL/GCAL FPGVLRKRAAPVDCLRPS 
ADTWRQEQIGCQ3AACAALRS*DSHKC*EGISGDKVEIDPVTNQ 
KASTKPWI KQKP I S IDS DLLCAC\ DLAEE


6676


277 


1678


GKWPTERMAFLDNPTI ILAHIRQSHVTSDDTGMCEMVL1DHDVD 
LB KI HP PSWPGDSGSE I QGSNGETQG YVYAQS VD ITS S WD FGI R 
RRSNTAQRLERLRKERQNQ IKCKN I Q WKERNS KQS AQE LKSLFE 
KKSLKEKPPISGKQS ILSVRLEQCPLQLNNPFNEYSKFDGKGHV 
GTTATKKIDVYLPLHSSQDRLObPMTWTMASARVQDLIGLI CWQ 
YTSEGRBPKIiNDNVSAYCLHIABDDGEVDTDFPPLDSNE PIHKF 
GFSTLALVBKYSS PGLTSKESLFVR I NAAHGFSL I Q VDNTKVTM 
KB ILLKAVKRRKGSQKVSGS RADGVFEEDSQ I D IATVQDMLSS H 
HYKSFKVSMIHRLRFT/TDVQL/GCALFPGVLRKRAAPVDCIjRPS 
ADTWRQEQIGCCGAACAALRS*DSHKC*EGISGDKVEIDPVTNQ 
KASTKFWIKQKP ISIDSDLLCAC\DLAEE 


6677 



277 


1678 


GNWPTERMAFLDNPTI I LAHIRQSHVTSDDTGMCE.WLIDHDVD 
LEKIHPPSMPGDSGSEIQGSNGBTQGYVYAQSVDITSSWDFGIR 
RR SNTAQRLERLRKERQNQ I KCKN I QW KERNS KQS AQE LKSLFE 
KKSLKEKPPI SGKQSILS VRLEQCPLQLNNPFNEYSKFDGKGHV 
GTTATKKIDVYLPLHS S QDRLLPMTWTMAS ARVQDLIGLI CWQ 
YTSEGR£PKLNDNVSAYCLHIAEDDGEVDTDFPPLDSNB PIHKF 
GFSTLALVEKYSSPGLTS KESLFVRINAAHGFSL IQVDNTKVTM 
KE ILLKAVKRRKGSQKVSGSRADGVFEEDSQIDIATVQDMIiSSH 
HYKSFKVSMIHRLRFTTDVQL/GCAliFPGVLRJOlAAPVDCLRPS 
ADTWRQEQIGCCGAACAALRS *DSHKC* EGISGDKVEIDPVTNQ 
KASTKFWIKQKP I S It)S DLLCAC\DLAEE 


6678


221 


865 


GPSNQSSGSLSLIVTGCSSVWS*INDTCTILRVLSSNFGRQ*L^ 
PPPCSQI^MSO/3CLWHLDCCCPWVPYIPGQQWRKGRQRMRW *QS 
LLGSDQESVGI^DLCVFVNFLLHVLLGLFP* PHELFLLPWDLG 
PLFP LLLQGG CHCLVLPANL VSQAPQ IGKLSCRLQTHDLEG SRN 
HHPLPLWGRWDAVKHLET VQSGLASLG FVGQHTSHGPP 


6679 


2 


786 


LE FARGAMPFLGQDWRS PGQNWVKTVDGWKRFLDEKSGS FVSDL 
SS YCN KE VYNKENLFNSLN YD/S CS QEEKEGHAE * QNQNS \ DPH 
QEKWIYVHKGSTKERHGYCTLGEAFNRLDFSTAILDSRRFNYVV 
RLLBLIAKSQLTSLSGIAQKNFWNILEKVVLKVLEDQQNITLIR 
ELLQTL YTSLCTLVKRVGKS VLVGNINMWVYRMETI LHWQQQLN 
NIQITRVSGQAQPPPGSGSLHRDTGQTRQDFEFTPVTEESGLF 


6680 


1496 


2951 


PLCTLPLMPSALPGWAGERWRKOWPLA/ PGPGTWOTPVES ISEP* 
P\RKNEPDTHCPRGEARPEV * HLPKPHS PGSEGAE IQTSA*ALP 
/NQVSPPQPM*GAEENGDQRGGKEBAGEELHRSSSGLTAAPGF? 
EVHRNLQTFPGLPS RGGG P / GGAGTQGS WAPGEQ P P/S PLLPAS 
MQRS Q AG LPG WEAGLVES PTHHI PALRPSGTNATGEAFPSTTCS 
SGP \ PAP PGPTGLRPGGGS S S GGHG * * PGLP VGKV\GALQAAQD 
PQSQGRGPTQGTVGTEMLLSGLGSAKACPAARPAVP* LPSDPAS 
TIPKKGTRGFGEGPGVLQERNRWVVGRAQGFTSADAAGTAPPGV 
* LPAPLSQPPGATEPQVRACGMAPPS PGTSGRLVAWGRHPG PQV 
AQGCPPGAGCWGSQPRGSQRCPRTYTHSPLGHGRAPCPRRCWH* 
WQDP PSS PRTGCLPGI PARQAYSAPRTRSRPG IRTGRAAYGFIR 
FQGGGGG 


6681


1169 


511 


INYIYYNQQQRAFHELK\EKLMSAPALGLPDLTKLFTLHVSER£ 
KMTVGVLTQTVG P WSRPGAYL S KQLDGVSKGWP PCPRALAATAL 
IAQEADBLTLRQNLNRKSPHA\WTLINTKGHH*LINARLTRYQ 
TLLCENPHKT 1 EVSNT/LN PATLLLVTES PVKHNCLBVLDS VYS 
SRPNLRDH P + TS VDWELYVDGSGFANPCKVTLKKETS PAPVTPR 
S 


6682 


109 


1238 


TVLCGAMQVSSLNEVKIYSLSCGKSLPEWLSDRKKRALQKKDVD 
VRRRIBLIQDFEMPTVCTTIKVSKDGQYILATGTYKPRVRCYDT 
YQLSLKFERCLDSEWTPEILSDDYSKIVPLHNDRYIEFHSQSG 
FYYKTR3 PKFGRDFSYHYPSCDLYFVGASSEVYRLNLEQGRYLN 
PLQTDAAENNVCD INS VHGLFATGT I EGRVECWDPRTRNR VGLL 
D\AP*WSQQIQR*TSLPTISALKFN\GALTMAVGTTTGQVLIjY 
DLRSDKPLLVKDHQ YGL P I KS VHFQDS LDLILSADSRI VKMWNK 
NSGKIFTSLEPEHDLNDVCLYPNSGMLLTANETPKMGIYYIPVL 
GPAPRWCSFLDNLTEELEENPESNE 


6683 


109 


1238 


TVLTOAMQVSSL^VKIYSLSOTKSLPtewI^DRiCKRALQmD^D"' 

VRRRIELIQDFEMPTVCTTIKVSKDGQYILATGTYKPRVRCYDT 

YQLSLKFERCLDSEWTFEILSDDYSKIVFLHNDRYIBFHSQSG 

FYYKTRIPKFGRDFSYHYPSCDLYFVGASSEVYRLNLEQGRYLN 

PLQTDAAENNV CDINS VHGLFATGTI EGRVECWDPRTRNRVGLL 

D\AP*TVSQQIQR*TSLPTISALKFN\GALTMAVGTTTGQVLLY 

DLRSDKPLLVKDHQ YGL P IKS VHFQDSLDLILSADSR I VKMWNK 

NSGKIFTSLEPEHDLNDVCLYPNSGMLLTANETPKMGIYYIPVL 

GPAPRWCSFLDNLTEELEENPESNE 


6684 


111 


527 


GLRGGTSRGRAGREPEFAAGVLCWAGFCQSPCPPGGRGREAPA

PP\SGRRHA*RPA*WLGGPGGDSGGREEGGS/GELQRAMESKMG 

ELPLDINIQEPRWDQSTFIiGRARHFFTVTDPRNLLLSGAQLEAS 
RNIVQNYR 


6685 


2"5B 


1473 


KLLGDNFEGFCNKFELSDSENGSNS*QSPL\FDRLFDPDPQKVL 
QGVI DMKNAVI GNNKQKANLI VLGAVPRLL YLLQQETSSTELKT 
ECAVVLGSLAMGTENNVKSLLDCHI I PALLQGLLS PDLKF I EAC 
IiRCLRTIFTSPVTPEELLYTDATVIPHLMALLSRSRYTQEYICQ 
IFSHCCKGPDHQTILFNHGAVQNIAHLLTSLSYKVRMQALKCFS 
VLAFENPQVSMTLVNVLVDGELLPQI FVKMLQRDKPIEMQLTSA 
KCLT3TMCRAGA IRTDDNCI VLKTLPCLVRMCS KERLLEBRVEGA 
ETLAYLIEPDVELQRIAS ITDHLIAMLADYFKYPSS VSAITDI K 
RLDHDLKHAHELRQAAFKL YASLGANDEDIRKKVSLGBGRP PVL 
XASRQGVTST 


6686 


310


927 


DS VTFDDLAVDFTPKEWTLLD PTQRNLYRDVMLENYKNLAT VG Y
QLFKPSLISWLEQEESRTVQRGDFQASEWKVQLKTKELALQQDV 
LGEPTSSGIQM IGSHNGG E VSDVKQCGDVSSEHS CLKTHVRTQN 
SENTFECYLYGVDFLTLHKKTSTGEQRSVFSHVWKKPSSLNPDV 
VCQKNRCTRKKKAF* LQLTLGKSFH*S IHT 


6687 


181 


915 


EAMLE AP YKKE E DEQQR KE VKKDYPSNTTS S TSNSGNETSG SS T 

IGETSNRSRDRDRYRRRNSRSRSPGRQCRHRSRSWDRRHGSESR 
SRDHRREDRVHYRSPPLATGEPimWT.Q DTCPonn»'mn?r»MOT 7>n 

IRPRDLEDFFSAVGKVRDVR 1 1 SDRNSRRSKGIAYVEFCEIQS V 
PLAIGLTGQRLLGVP 1 1 VQASQAEKNRLAAMANNLQKGNGGPMR 
LYVGSLHFNITBDMLRGIFE PFGKV 


6688 


1025 


1 


AEVPNYPRVFHKCPDSCWRFKFQPIQLQPYlLLSfSSEKPPISf - 
SEPGLPR/ SATARMATAAAPPNSS IDLPSDSGMGFISPAGDSLD 
LPSDGGTGF PS LAGDS SSTRLSSLAFI S FSLSS VS VGS SAGTTS 
STSVGSWAAFTSSSSSSTNRDVAGLDFSWITSVSGSLVPSRE 
VAVICGS KGAGASGS ASCSSRAGKTTEATAASSMPSGTSSFSTC 
TMSELEELFSLFSPAPLLSKLFTSSGSIAICCQDSGPSDTGRLS 
VCQLWLADSDTGKLS DCQEWTVGDSGGLTCPELSLGRM * M5LL 
SSAVIPGYSSSSDSRLNTVPTVDLLCPFQTKSST 


6689 


640 


1299 


SSSASYATSATSISDTAFSGSLKLKHGLLSALDSSSRTS*STSS 
7VEDSTFRICSPSVSDTSSDSSGSKDNVLILFSKVSI*SCFSLSS 
FFSDSISFCFSSSSFCKR*FVSSKVSQNALLSSRLSNGPGGSSK 
QRNSLTARQLAMS L* ATKF *RNACNPNCLS S KKS AL * LS LNQRF 
GGSASRKPGNISFN3QKCSALSYCCNFVIKPREVSVSSKNYPAF 


6690


1 


442 


GTRGKMAATLGPLGSWQQWRRCLSARDGSRMLLLLLLLGSGQGP 
QQVGAGQTFEYLKREHSLSKPYQGVGTGSSSIiWNLMGNAM VMTQ 
Yl RLTPDMQSKQGALWNRVPCFLRBVTBI^JVHFKIHGC^KKinjN^ 
GDGLAIWYTKDRMQP 


6691 


287 


1401 


LKTETSEBKARRYKDRPSQLNAVFQEQKKMIQAQESITLEDVAV 
DFTWEBWQLLGAAQKDL YRDVMLBNYSNLVAVGYQAS KPDALFK 
LBQGEQLWTIEDGIKSGACSDIVJKVDHVLBRLOSESLVNRRKPC 
HEHDAFENI VHCS KSQFLLGQNMD I F DLRG KS LKSN LTLVNQSK 
GYEIKNSVEPTGNGDSFLHANHERLHTAIKFPASQKLISTKSQF 
ISPKHQKTRKLEKHHVCSSCGKAFIKKSWLTDHQVMHTGEKPHR 
CSLCEKAFSRKFMLTEHQRTHTGEKPYECPECGKAFLKKSRLNI 
HQKTHTGEKPYIC5ECGKGFIQKGNLIVHQRIHTGEKPYICNBC 
/GKGFIQKTCLIAIiQRFHTER 


6692 


178 


939 


WIKEGELSIiWERFCANI IKAGPMPKHIAFIMDGNRRYAKKCQVE 
RQEGHSOXjFNKIiAETIiRWCIiNLGILEVTVYAFSIENFKRS ksev 
DGLMDIi^QKFSRl^IBE KEKIjQKHGVCIR VLGDLHLLPIiDLQEL 

iaqavqatknynkcflnvcfaytsrheisnavremawgveqgll 
dpsdi seslldkciiytnrs phpdilirtsgevrlsdfllwqtsh 
scxvfqpvlwpeytfwnlfeailqfqmnhsvlqk 


6693 


178 


939 


WIKBGELSLWERFCANI IKAGPMPKHIAFIMDGNRRYAKKCQVE
RQEGHSQGFNKLAETIiRWCLNLGILBVTVYAFSIENFKRSKSEV 
DGLMD IARQKFS RLMEEKEKLQKHGVCI RVLGDliHLliPLD LQEL 
IAQAVQATKNYNKCFLNVCFAYTS RH EI SNAVREMAWGVEQGLL 
DPSDISESLLDKCLYTNRSPHPDILIRTSGEVRLSDFLLWQTSH 
SCLVFQPVLWPEYTFWNLFEAILQFQMNHSVLQK


6694 


292 


813 


SLLLHLAPPGAYTPSQPLSSVSTETASSVRRQAAESRQHEUPVR 
EVHSIiGQILPQDGLTAEAGPPEAQDPWGSPGISLPAAHIGFAAA 
IiAVGPSGCHTEP\ FDEVWPSLFLGDAYAARDKSKLIQLG ITHW 
MAAAGKFQVDTGAKFYRGMSLEYYG IEADDNPFFDLS VYFI»P 


6695 


292 


813 


SLLLHIiAPPGAYTPSQPIiSSVSTETASSVRRQAAESRQHEXjPVR 
EVHSLGQILPQDGLTAEAGPPEAQDPWGSPGISLPAAHIGFAAA 
LAVGPS G CHTBP\ FDEVWPSLFLGDAYAARDKS KLI QLG ITHW 
NAAAGKFQVDTGAKFYHGMSLEYYGIEADDNPFFDLSVYFLP 


6696


1 


782 


PRVRGRVGERWAFLS VPAAMSS EMEPLLLAWS YFRRRJCFQLCAD
LCTQMLEKS P YDQAAW I LKARALTEMVYTDE I DVDQEG IAE MMIi 
BoNAIAyVPRPGTSIjKIiPGTNu/rGGPSQAVR^ 
RPSTQSGRPGTMEQAIRTPRTAYTARPITSSSGRFVRLGTASML 
TSPDGPFINLSRLNLTKYSQKPKLAKALIEYI FHHENDVKTALD 
LAALSTEHSQ YKD WW WK/DQ I EKCYYR VGM YRE AE KQ IKS S 


6697


3 


782 


PPLPLRRLNSRALRPGSRKVMAWPASIiSGQDVGS FAYLTI KDR 
IPQILTKVIDTUJRHKSEFFEKHGEEGVEAEKKAISLLSKLRNE 
LQTDKPFIPLVBKFVI)TDIWlfQYLEYQQSLLNESDGKSRWFYSP 
MLLV\ECYMYRRIHEAI\IQSPPIDYFDVFKESKEQNFYGSQES 
IIALCTHLQQLIRTIEDLD\ENQLKDEFFKLLQISLWGEISVDL 
SL\SGGES SSQNTNVLNSLEDLKPFILLNDMEHLWSLLSNCK 


6698 


668 


754 


VGSCACAGSCKCKECKCTSCKKSECRAFP 


6699 


325 


492 


EGELP/ PARR VLPRAMTASAQPRGRRPGVGVGWVTSCKH PRC V 
LLGKRKGS VGAGS FQLPGGHLEFGETWEECAQRETWEEAALHLK 
NVHFASVVNSFIEKENYHYVTILMKGEVDVTHDSEPKNVEPEKN 
ESKRI IYNHAPFFQBSKWSGGILQ 


6700 


1096 


1392 


TQCWRS STPGMRTHFRTQ P / RLECGQGFSQQENGHCMDTN ECIQ 
FPFVCPRDKPVCVNTYGSYRCRTNKKCSRGYEPNEDGTACVERT 

lllglcnllgk 


6701 


2 


1485 


AAAGPRTRVRRAAAFEGQPSPSPGLGPTSDKAAAPRTPKRRRLW 

RORfi /HPfiMTiCYVn) DHAVT JvTP^/E'\/c T a VAKincnr'T MmrnoBT r> t 
nrniTiiv.! v jl XviriJ/\vjuriOViSVXUAJ^/WulSU^ijNyVL.KiCL»GX 

IKVD Y FGLQFTG S KGES L WLNLRNR I S QQMDGLAP YRIjKLRVKF 
FVEPHL I LQEQTRHI FFLH Z KEALLAGHLLCS P EQ A VE L S ALLA 
QTKFGDYNQNTAKYNYEELCAKELSS ATLNS I VAKHKELEGTSQ 
AS AEYQ VLQIVS AMEN YG I EWHS VRDS EG QKLL I GVGPEG I S I C 
KDDFS P INRIAYPWQMATQSGKNVYLTVTKES GNS I VLL PKMI 
STRAASGLYRAITETHAPYRCDTVTSAVMMQYSRDLKGHLASLF 
LNENINLGKKYVPDI KRTSKEVYDHARRALYNAGVVDLVS RNNQ 
SPSHSPLKSSESSMNCSSCEGLSCQQTRVLQEKLRKLKEAMIiCM 
VCCEBE INST PC P OGHTVCCES CAAQLQVGESAAHFCLQPHLS L 
LLTGS RSQVLAR 


6702 


397 


1971 


PIAKPLKIJ3LVNVI,CLPMEDVPLPYR^Ct , CSMGLGSSCHI*SLPK 
RAEALLCSRKATVVRDLVAVRjMAEEQEFTQLCKLPAQPSHPHCV 
NNTYRS AQH SQ ALL RGL LALRDS G ILFD WLWE G RH I EAHR I L 
LAASCDYPKGMFAGGLKEMEQEEVIjIHGVSYNAMCQItiHFIYTS 
ELELSL3 NVQETLVAACQLQ IPEII HFCCDFLMS WVD EEN I LDV 
YRLAELFDLSRLTEQLDTYILKNFVAFSRTDKYRQLPLEKVYSL 
LSSNRLEVSCETEVYEGALLYHYSLEQVQADQISLHEPPKLLET 
VRFPLMEAEVLQRLHDKLDPS PLRDTVASALMYHRNESLQPSLQ 
SPQTELRSDFQCWGPGGIHSTPS\MSSATRPKYLNPLLGEWKH 
PTAS LAP RMS NQG IAVLNNFVYL IGGDNNVQGFRAE SRCWR YD P 
RHNRWFOI QSLQQEHADLS VCVVGR YI YAVAGRD YHNDLNAVER 
i u t»A l£ict WAX V AF juKRE VYiUiAGATLEGKMYITCGRKGRIT 


6703 


45 


1244 


GVGPRAAAMPLELELCPGRWVGGQHPCF 1 1 AEI GONHQGDLD VA

KRMIRMAKECGADCAKFQKSELEFKFNRKALERPYTSKHSWGKT 

YGEHKRHLEFSHDQYRELQRYAEBVGI FFTASGMDEMAVE PLHB 

LNVPFFKVGSGDTNNPPYLEKTAK/TRGWHSVLRDVCGVQLNDE 

TSSWDV1^RVRTSKEKVIWVI»VI^YSGRPMVISSGMQSMDTMKQ 

VYQIVKPLNPNFCPLQCTSAYPLQPEDVNLRVISEYQKLFP0IP 

I G YS GHETG IAIS VAAVALGAKVLE RH I TLDKTWKGSDHS AS LE 

PUrAJLinBJUVXva VIUIVJSKAliudlr JL ZvVw""*"^NACT^KLGKSVVAJCV 

KIPEGTILTMDMLTVKVGEPKGYPPEa)IFNLVGKK\nLiVTVEEDD 
TIMBE 


6704 


82 


1007 


TMNTRNRWNSGLGASPASRPTRDPQDPSGRQGELSPVEDQREG
LEAAPKGP S RE S VVHAGQRRTS AYTL IAPN INR RNB I Q R I AEQE 

LANLEKWKEQNRAKPVHLVPRRLGGSQSETEVRQKQQIiQLMQSK 
YKOKLKREES VRT KKEAEEAELOIfM ICA.T f>P V VQMTfT. P P irtror 

NLRREAFREHQQYKTAEFL/RQTEHRIARQKCLSKCCLWPTILN 
MGQKLGLQ\DSLKAEENRKLQKMKDEQHQKSELLELKRQQQEQE 
RAKIHQTEHRRVNl^FLDRIxQGKSQPGGLEOSGGCWNMIJSGNSW 
GI 


6705 


2 


786 


RLCRNSARVP CGWS ASRS LGEGAG F IGPLRGPKP RAGGTGTS FT 
SYKRKGGIMSTIAAFYGGKSILITVATGPLGKELMEKLFRTSPD 
LKVIYILVRPKAGQTLQHRVFQILDSKLPEKVIEVRPNVHEKIR 
AI YADLNQNDFAIS KE DMQBLLS CTNI I FHCAATVRPDDTLRHA 
VQLNVTATRQLLLMASQMPKLEAFIHISTAYSNCNLKHIDEVIY 
PC PVE PKKI IDS LEW \ LDDAI I DE I T PKL I RDWPNI YT YTK 


6706 


130 


521 


PTHSSSSHSQEMI/3KIiNMIiRNIX3HFCDITIRVQDKIFPJUnCVVL 
AACS DFFRTKLVGQAEDENKNVLDLHHVTVTGFI PLLE YAYTAT 
LS INTENI IDVLAAASYMQMFSVASTCSEPMKSSILWNTPNSQP 
EK 



WO 01/53312

PCT/US00/34263





6707 


2233 


1343 


YWSGIGYELQHFHWRKFKFEKKGPPSTCQBRLYESRSRWPCIS* 
GMVWGWTAVNGSW* GGQLRCVCVCTSHSSDSTRSSQRAS KCHS 
FFI LSQ * KT * S S WENW VFAKYS R I YS YGHS CS KGRGD * DFK*NV 
SQAR * SR FCGLCNP CGHCGLD INLRGGS SPWTD KHSCVHNNLLC 
NRRVFSLIiCEGPGHCYQGAVCRBACAAASPGLDS AAE PHRLCEM 
TD*LPK*GPGYIQHFHCDSN1LCILYNISFNLFSYSF*GVARYA 
C*RCHWYFEWLLYNHCGD ILVACL* RRQL* SSQ 


6708 


115


1729 


TVGSWSRSGRSPPVGRQLLLTGRGAQAAGSPQGGMALQVELVPT 
GEI IRWHPHRPCKIALGSDGVRVTMESALTARDRVGVQDFVLL 
ENFTSEAAF2 ENLRRRFRBNLI YTYIGPVLVSVNP YRDT .Q I YSR 
QHMERYRGVS FYEEP PHLLA VADTVYRALRT ERRD Q A VM I S VE S 
GAGKTDATKRLLQLYAETCPAPQRGGAVRDRLLQSNPVLEAFGN 
AKTLRNDNS S RFGKYMD VQ FD FKGAP VGGK I LS YLLEKSRWHQ 
NHGERNFHI FYQUiBGGEEETLRRLGLERNPQS Y1.YLVKGQCAK 
VSSINDKSDWKVVRKALTVIDFTEDEVEDLLSIAASVLHLGNIH 
FAANEESNAQVTTEN'QLK YLTRLLS VEGSTLREALTHR KI IAKG 
B E LL S PLNL EQAAYARD ALAKAVY S RTFTWL VG K I NRS LAS KD V 
ESPSWRSTTVLGLLDIYGFEVFQHNSFEQFCINYCNEKLQQLFI 
ELTLKSEQEBYEAEG IAWEPVQYFNNKI ICDLVEE KFKGI I \SI 
LDE\ECLRPGE 


6709 


3 


894 


PPHEHLFPSGERGPFSFLVSRRGLGPGKMGKKGKKEKKGRGAEK 
TAAKMEKKVSKRSRKBEEDLEALIAHFQTLDAKRTQTVELPCPP 
PSPRIiNASLSVHPEKDELILFGGEYFNGQKTFLYNELYVYNIRK 
DTWTKVDI PSPP PRRCAHQAVWPQGGGQLWVFGGEFAS PNGEQ 
FYHYKDLWVLHLATKTWBQVKSTGGPSGRSGHRMVAWKRQliILF 
GGFHESTRDYIYYNDVYAFNUDTFTWSKLSPSGTGPTPRSGOQ\ 
I PSLPRAAS SVYGGYS KQRVKKDVDKGTRHSDM F


6710


158 


980 


RHKMTNYRV^SSGRAARKMRIALMGPAPIAAIGYIDPCNFATN 
IQAGAS FG YQLLWVWWANLMAMLI QILSAKLG IATGKNLAEQI 
RDHYPRPWWFYWVQAEI IAMATDLAEFIGAAIGFKLILGVSLL 
QGAVI*TGIATFliILMliQRRGQKPIiEKVIGGLLIiFVAAAYIVELI 
FSQPNLAQLGKGMVIPSLPTSEAVFIiAAGVL \GATIMPHVI /YI 
WHSS LTQHLHGGSRQQRYSATKWDVAI AMTIAGFVN LA I MATAA 
SELNFYGHTGVA 


6711


3 


347 


VTECKTMTCKMSQLERN I *TMINTLHHYSVKLGHPDTI* IHGEFK 
ELVRTDLHN I LM KENKNDQAI *H I MEDLDTNAHMQ I I FKEL IML 
MAMLTWSYHDNMHDADYGPGQQHRPG 


6712 


118 


578 


PHGQKRTRYPQVRAPGQQPQAQLAMALCIiKQVFAKDKTFRPRKR 
FEPGTQRFELYIQCAQASLKSGLDLRSVVRLPPGENIDDWIAVHV 
VDF FNR INL I YGTMAERCS *TS CP VMAGGPRYE YRWQDERQ YRR 
PAKLSAPRYMALLMDWIESLI 


6713 


2485 


3 


QARGSDSEDGBFEIQAEDDARARKLGPGRPLFrFPTSECTSDVE 
PDTREMVRAQNKKKKKSGGFQSMGLS YP VFKG IMKKGYKVPTP I 
QRKTI PVILDGKDWAMARTGSGKTACFLLPMFERLFCTHS AQTG 
ARALILSPTRELALQTLKFTKELGKFTGLKTALILGGDRMEDQF 

MGFAEQLQEI IARr,PGGHQTVLFSATLPKLL.VEFARAGLTEPVL 
IRLDVDTKLNEQLKTS FFLVREDTKAAVLIJttJ>HNVVRPQDQTV 
VFVATKHHAE YLTEU^TTQRVS CAHI YSALDPTARKINLAKFTL 
GKCSTLI VTDLAARGLDI PLLDNVINYS FPAKGKLFLHRVGRVA 
RAGRSGTAYSLVAPDEIPYLLDUUjFLGRSLTLARPIjKEPSGVA 
GVDGMLGRVPQSWDEEDSGLQSTLEASLELRGLARVADNAQQQ 
YVRSR PAPSPES I KRAKEMDLVGLGLHPl/FSSRFEEEELQRIiRL 
VDS I KNYRSRATI FE INASSRDLCSQ VMRAKRQKDR KAI ARFQQ 
GQQGRQEQQEGPVGPAPSRPALQEKQPEKEEEEEAGESVBDIFS 
E WGRKRQRSG PNRGAKRRREEARQRDQEFYT P YRPKDFDS ERG 
LS ISGEGGAFEQQAAGAVI^LMGDEAQNIiTRGRQQIjKWDRKKKR" 
FVGQSGQED10CKIKTESGRYISSSYKRDLYQKWKQKQKID*S+L 
GRRRG I LTRRR PR TEE VGEAR PLAQAG C I FGPHAFRHPL QAES A 
LBLKTKQQIIiKQRRRAQKAALSLQRWWPQAALCPQ 


6714


169 


1416


NNCQELLPPPPAPMAHI PSGGAPAAGAAPKGPQYCVCKVELSVS 
GQNLLDRDVTSKSDPFCVLFTENNGRWIEYDRTETAINNLNPAF 
S KKFVLD YHFEE VQKLKFALFDQD KS S MRLDEHDFLGQFSCSIjG 
TIVSSKKITRPIiLLIiNDKPAGKGLirrAAOBLSDNRVITIiSLAG 
RRLDKKDLFGKSDPFLEFYKPGDDGKWMLVHRTEVIKYTLDPVW 
KPFTVPLVSLCDGDP4EKPIQVMCYDYDNDGGHDFIGEFQTSVSQ 
MCEARDSVPLEFECINPKXQRKKKNYKNSGIIILRSCKINRDYS 
FLDYrLGGCQLMFTVGIDFTASNGNPLDPSSLHYINPMGTNEYL 
SAIWAVQQ1IQDYDSDKMFPALGFGAQLPPDWKVSHEFAINFNP 
TNPFCSGVDG IAQAYSACLP 


6715 


32 


493 


GPAGAESGSIjHCLPATVQALAGAAHSPHGGQPPRRGPLIGSGMP 
GKPKHLGVPNGRMVLAVSDGELSSTTGPQGQGEGRGSSLS IHSL 
PSGPSSPFPTEEQPVASWALSFERLLQDPLGLAYFTEFLKKEFS 
AENVTFWKACERFQQI PASDT 


6716


1 


176 


GAGGPAPRSFGSEEPRAALERDKMSARAAAAKSTAMEETAIWEQ 
HTVTmRVSLCCSK 


6717 


115 


896 


LFAMSGFENLNTDFYQTSYSIDDQSQQSYDYGGSGGFYSKQyAG 
YDYS QQGRFVP PDMMQPQQPYTGQ I YQPTQAYTPAS PQPFYGNN 
FEDEPPLLEELGINPTJHIWQKTLTVLHPLKVADGSIMNETDLAG 
PMVF CLAFGATLLLAGKIQFG YVYG I SAIGCLGMFCLLNLMSMT 
G VS FGCVAS VLG YCLLPM I LI*SS FAVI FSLQGMVG I ILTAG I IG 
WCSFSASKIFISALAMEGQQLLVAYPCALLYGVFALISVP 


6718 


290 


599


KQSS TVPGTILP S LKWHNSGIjCKFPETGGKMTT FKEGLTFKDVA 
VI FTEEE LGLLD P VQRNLYQD VMLEN FRNLLS VGHHP F KH D VF L 
LB KE KKLD IMKTATQ 


6719 


1 


691 



PTRPEEQDREDGKCHKMBMNPISGNLNCDPIAMSQCSSDHGCET 
DLDSDDDKIEKPNNFMKDSASQDNGLSRKISRKRVCSSDSDSSL 
Q WKKS S KARTGLLR I TRR CAATAAN KI KLMS D VE D VS LENVHT 
RSKNGRKKPLHLACTTAKKKL3DCEGSVHCEVPSEQYACEGKPP 
OPDS EGSTKVLS QALNG DS DS EDMLNS EHKHRHTNIHKI DAPS K 
RKSSSVTSSG 


6720 


3 


822 


HE VAEEAGGTVYPQRGTMPGTKRFQHV I EOTE" PGKWeI/TGYEAA " 
VPITEKSNPLTQDliDKADAENIVRLLGQCDAEI FQEEGQALSTY 
QRLYSES I LTTMVQVAGKVQE VLKEPDGGtiWLSGGGTSGRMAF 
LMSVS FNQLM KGLGQKPL YT Y L I AGGDRS WAS RE GTEDSALHG 
I EEL KKVAAGKKR VI VIG I S VGLS AP FVAGQMDCCMNNTAVFLP 
VLVG FNP VSMARH PFPPP R I LRSLTVFP S LRAPH YQ I TSLI*FSM 
SWTLISE 


6721 


3 


822 


HBVAEEAGGTV Y PQRGTMPGT KRFQHV I ETPE PG KWELTGYEAA
VPITEKSNPLTQDLDKADAENIVRLLGQCDAEIFQEEGQALSTY 
QRLYS ES ILTTMVQVAGKVQE VLKEPDGGLWLSGGGTSGRMAF 
LMSVS FNQLMKGLGQKPIjYTYLIAGGDRSVVASREGTEDSALHG 
I EELKKVAAGKKRVI VIG I S VGLS AP FVAGQMDCCMNNTAVFLP 
VLVGFNPVSNlARHPFPPPRILRSLTVFPSIiRAPHYQITSLLFSM 
SWTLISE 


6722 


1 


390 


RSWSKRTWQALPMAVLFLLLFLCGTPQAADNMQAIYVALGEAVE 
LPCPSPSTLHGDEHLSWFCSPAAGSFTTLVAQVQVGRPAPDPGK 
PGRESRLRLLGNYS L WLEGSKEEDAGRYWCAVLGQHHNYQNW 


6723


173 


659 


VCQYCTARMAD FG I S AGQFVAVVWDKSS P VEALKGLVDKLQALT 
GNEGRVS VENI KQLLQS AHKESSFDI I LS GLVPGSTTLHSAEIL 
AEIAR ILRPGGCLFLKEPVETAVDNNS KVKTASKLCSALTLSGL 
VEVKELQREPLTPBEVQSVREHLGHESDNL


6724 


173 


659 


VCQ Y CTAKMAD FG I S AGQFVAWWDKSS PVEALKGLVDKLQALT 
GNEGRVS VENI KQLLQS AHKES S FD I ILS GL VPGS TTLHS AE I L 
AE IARILRPGGCLPLKEPVETAVDNNSKVKTASKLCSALTLSGL 
VEVKELQRE PLTPEEVQS VRBHLGHESDNL 


6725 


1 356 


722 


RRRTPPVIIjATMDDDLMIiALRIjQEEWNI^EAERDHAQESIjSLVD 
ASWELVDPTPDLQALKVQFNDQPFWGQLEAVEVKWSVRMTLCAG 
I CS YBGKGGMCS IRLSEPLLKLRPRKDLVEVFFV 


6726


98 


714 


HLOKMKRl<INRkEKEKEYEGKHNSLEDTD(^-KNUK^TLMT£j^G 
G YLYI TQKQTLTKYPDTFIiEG 1 VNGKI LCPFDADGHY F I DRDGL 
LFRHVLNPLRNGELLIiPEGFREKQIjIjAQEAE FFQLKGLAEEVKS 
RWEKEOjLTPRETTFIiEITDNHDRSQGIiRIPCNAPDFISKI KSRI 
VLVSKSRLDGFPEEFSISSNIIQFKYFIK 


6727 


1


831 


FRGMGDSRPHYYGKHGTPQKYDPTFKGPI YtJRgCTDl 1 CCVFLL 
LAI VGYVAVO I IAWTHGDPRKVI YPTDSRGE FCGQKGTKNENKP 
YLFYFNIVKCASPLVLLEFQCPTPQICVEKCPDRYLTYLNARSS 
RDFE YYKQFCVPGFKNNKGVAEVLRDGDCPAVLI PSKPIjARRCF 

paihaykgvlmvgnettyedgbgsrknitdlvegakkangvlea 

RQLAMRIFBDYTVSWYWDI ISLGIAMAMSLLFI ILLRFLAG Imq 

rgmiimgxlvlgy 


6728


486 


93S 


fcsswlrsladsslswkmflvgltggiasgk^^vIqvfqqlgca" — 
vidvdvmarrwqpgypakrrivevfgtevllemgdinrkvlgx) 
li fnqptorqllkaithpeirkemmketfkyflreprts prgkk 
hvpsalkeadslmrrdt 


6729 
6730


2*9 


1191 



VGLTQAQSGRTASMGRDQRAVAGPAIiRRWLLLGTVTVGFLAQSV 
LAGVKKFDVPCGGRDCSGG CQCYPBKGGRGQ PG P VG PQG YNGP P 

gi^fpglqgrkgdkgergapgvtgpkgdvgargvsgfpgadg I 

PGH PGQGGPRGR PG YDGCNGTQGDSGPOGP PGS EGFTGP PGPQG 
PKGQKGEPYALPKEERDRYRGEPGBPGLVGFQGPPGRPGHVGQM 
GP VGAPGRPGP PGPPGPXG QQGiJRGLGFYGVKGBKGD VGQ PGPN 
GXPSDTLHP I IAPTGVTFH PDQYlCGE KG 3 E GE PG IRG I S L KGE E 
GIM 




784 


1015 


NMVDYYEVLGLQRYAS PED I KKAYHKVALKWH PD KNP EN K E EAE 
RKFKEVAEAYEVLSNDEKRDI ydkygteglne f 


6731


1 


446 


GIRKRIiHGAVVPRVEVGCPWETJ^SEGVH^^ 

LDIYAGLDSAVSDSASKSCVPSRNCLDLYEEILTEEGTAiCEATY 

NDI^VEYGKCQLQKKELMiCKFKEI0/I\5NFSLINENQSLKKNISA 
LI KTARVE INRKDE E I 


6732 


102 


1205 


GRWQRRPPPPSPPLWCLQPGGGSDPQQLTQLRHCLSHSPO^fpir" 
AQRQVCYTAATTQAAAPATRNCLPDHSGHRPTPPRSHRHHRQEN 
LGSIKPSSRSTKATSTTMAGDGRRAEAVREGWGVYVTPRAPIRE 
GRGRLAPQNGGSSDAPAYRTPPSRQGRREVRFSDEPPBVYGDFE 
PLVAKERSPVGKRTRLEEFRSDSAKEEVRESAYYLRSRQRRQPR 
PQ BTEEMKTRRTTRLQQQHSEQ PPLQPS P VMTRRGLRDSHS SEE 
DEASSQTDLSQTISKKTVRSIQEAPAVSEDLVIRLRRPPLRYPR 
i iw*i o v w «. v w xtj cbGJSTBEDDQDSSHSS VTT VKARS RDSDESG 
DKTTRSSSQYIES FW 


6733 


613 


1311 


RSCRQVGMRSRWQGGESASDGHISCPKPSI IGNAGEKSLSEDAK - 
KKKKSNRKEDDVMASGTVKRHLKTSGECERKTKKSLELSKEDLl 
QLIiSIMEGEl^AREDVIHMLKTEKTKPBVLEAHYGSAEPEKVLR 
VLHRDA I LAQEKS IGEDVYEKP I S ELDRLE EKQKETYRRMLEQk 

LLAEKCHRRTVYEIiENEKHKHTDYT^NKSDDFTNLLEQERERIiKK 
LLEQEKAYQARKE 


6734 


189 


551 


SAAMFPVFSGCFQELQEKNKSLELVSFEBVAVHFTWEEWQDLDD 
&QRTLYRD VMLET YS S L VSLGHC I TKPEM I FKLEQGAE P W I VBE 
TLNLRLSGQ S KKQVFSG I CHRS LVELQBVHLV 


6735 


280 


558 


KSRRAGVTKMSNPFLKQVFNKDKTFRPKJ?KFEPGTQRPELHKk5~ 
QAS LNAGLD LRLAVQ L P PGE DLNDWVAVHWD FFNR VNL IY GT I 
XDGCT 


6736 


195 


808 


MNYELNFKREMPNIKSLGLTNLNFLLKRLSSVLPLITDYVYFEN 
SSSNPYLIRRIEELNKTASGNVEAKWCPYRRRDISNTLIMLAD 
KHAKEIEBESETTVEADIjTDKQKHQLKHRELFLSRQYEStiPATH 
IRGKCSVALLNETESVLSYLDKEDTFFYSLVYDPSLKTLLADKG 
EIRVGPRYQADIPEMLLEGTFFCVFAVL 


6737 


150 


1209 


PVIMPLHFS PGDI VRPSCCVSSS PKLRRNAHSRIiES YRPDTDLS" ' 
REDTGCNLQHI SDRBNIDDLNMEFNPSDHPRASTI FLS KSQTDV 
REKRKSLFINHHPPGQIARKYSSCSTIFLDDSTVSQPNLKYTIK 
CVALAI YYH I KNRD PDGRMLLD I FDENLH PLSKSE VP PDYDKHN 
PEQ KQI YRFVRTLFSAAQLTAECAI VTLVYLERLLTYA E ID I CP 
ANVI KR I VLG A I LLAS KVWDDQA VWNVD YCQ I LKDITVEDMNELB 
RQFLELLQFNINVPSSVYAKYYFDLRSIiARANNLSFPLEPLSRE 
RAHKLEAISRI*CEDKYKDLRRSARKRSASADNLTLPRWSPAI IS 


6738 


148 


653 


CACAEQPARAE VG AATALPVRWASGEMAPSGS LAVPLAVL VLLL
WGAP WTHGRRSNVR VI TDENWRBLLEGDWM I EFYAPWCPACQNL 
QPEWESFAEWGEDLEVNIAKVDVTEQPGLSGRFIITALPTIYHC 
KDGEFRRYQGPRTKKDFIHFISDKEWKSIEPVSSWF 


6739 


3 


631 


SWPDMAEEEVAKLEKHI.MLLRQEYVKIiQI^^ETEKRCAI.O^Q 
ANKESSSESFISRUAIVADLYEQEQYSDLKIKVGDRHISAHKF 
VIJUU^DSWSIiANLSSTKBLDI^DANPE^TMTMLRWIYTDELEF 
REDDVFLTELMKLANRFQLQLLRERCEKGVMS LVNVRNCIRFYQ 
TAEELNASTIiMNYCAEI IASHWVSEVEGVNKAL 


6740 


3 


631 


S WPDMAEEBVAKLE KHLMLLRQE YVKLQKKLAETEKRCALIiAAQ 
ANKESSS ESFISRLLAI VADLYEOEOYSDliKIKVEDRH T <3 A win? 
VXiAARSDSWSLANLSSTKELDI^DANPETVTM'IWLRWIYTDELEF 
REDDVFL'I^LMKIJVNRFQI^LLRERCEKGVMSLVNVRNCIRFYQ 
TAEELNASTLMNYCAE I IASHWVSEVEGVNKAL


6741


141 


960


PLTLP FS SRARAGHTMNTS PGT VGSD PVILATAG YDHTVRFWQA 
HSG ICTRTVQHQDS QVNALEVTPDRS MIAAAVQP VS LGYQHI RM 
YDLNSNNPNP1 IS YDGVNKNIASVGFHEDGRWMYTGGEDCTARI 
WDLRSRNLQCQR I FQ VN AP I NCVCLHPNQAEL I VGDQSGA I H rW 
DLKTDHNEQLIPEPEVS ITS AHI D PDAS YMAAVNSTLVPFSCLL 
PLAIGI LQEGEFESLARRGLLFIACQGNCYVWNLTGGIGDEVTQ 
LIPKTKIP 


6742 


141 


960 


PLTLPFSSRARAGHTMNTS PGTVGSDPVILATAG YDHTVRFWQA - 
HSGI CTRTVQHQDS Q VNALE VTPDRSMI AAAVQP VS LG YQH I RM 
YDLNSNNPNPI IS YDGVNKNIAS VGFHEDGRWMYTGGEDCTARI 
WDLRSRNLQCQRIFQVNAPINCVCLHPNQAELIVGDQSGAIHIW 
DliKTDHNEQLlPEPEVSITSAHIDPDAS YMAAVNSTLVPFSCLL 
PLA I G Z LQEGE FESLARRGLL FLACQGNCYVWNLTGG IGDE VTQ 
LIPKTKIP 


6743 


1


412 


MHSTQDKSLHLEGDPNPSAAPTSTCAPRKMPKRISISKQIiASVK 
ALRKCSDLEKAIATTALIFRNSSDSDGKLBKAIAKDLLQTQFRN 
FAEGQETKPKYRE ILSELDEHTENKLDFEDFMILLLS ITVMSDL 
LQNIR * 


6744 


95 


1343


RTPARNRCAUCEVLS RFSS PNKAS SFALQSAGGGLPAVRALRRD 
RQKVSTVG YGMDEVEQDQHBARLKELFDS FDTTGTGSLGQEELT 
DLCHMLSLEEVAPVLQO/ETjLQDNLLGRVHFDQFKEALI LILSRT 

lsneehfqe pdcs l e aq pkyvrgg kr ygrrs lpefqes veefpe 
vtviepldeearpshipagdcsehwktqrseeyeaegqlrfwnp 
ddlnasqsgssppqdw i eeklqevcedlgitrdghlnrkklvs i 
CEQYGLQNVDGBMIiEEVFHNLDPDGTMSVEDFFYGLPKNGKSLT " 
PSASTPYRQLKRHLSMQSFDESGRRTTTSSAMTSTIGFRVFSCL 
DIXjMGHASVERILDTWQEEGIEWSQBILKALDPGLDGNINLTEL 
TLALENELLVTKNS IHQACI 


6745 


1 


588


TFRDQGWAQRRRWLLGCASWESWBAAIAAGPGLPSSTARQQNNP " 
AAGTECFAAVWARGTAMGSVLS TDSGKSAPASATARALERRRDP 
ELPVTS FDCAVCLEVLHQP VRTRCGHVFCRS C I ATS L KNNKWTC 
PYCRAYLPSEGVPATDVAKRMKSEYKNCAECDTLVCLSEMRAHI 
RTCQKYIDKYGPLQELEETA 


6746 


110


492 


GATGAT4AE^PARkRRKRR^TPLtS5TLPSQATEKSSYFQTTEI ' 

SLWTWAAIQAVEKKMESQAARLQSLEGRTGTAEKKIiADCEKMA 

VEFGNQLJBGKWAVLGTLLQEYGLLQRRLENVENLLRNRN 


6747


247 


484 


EAVTFKDVAWFTBEELGLLDIiAQRKIiYRD VMLENFrktt .r'si vnu ~ 
QPFHRDTFHFIiREBKFWMMDIATQREGNSVYAGVC 


6748 


201 


665 


MTTFKEAVTFKDVAWFTEEELGLLDPAQRKLYRDVMLENFRNL 
LSVGNQPFHQDTFHPLGKEKFWKMKTTSQREGNSGGKIQIEMET 
VPEAGPHEEWSOQQIWEQIASDLTRSQNS I RNSSQFFKEGDVPC 
Q I EARLS I S XVQQXP YRCNECKQ 


6749 


95 


719 


RREVKGGDGVCPRARGSPQSQQFPSCAGGGEGLiQQSGEAIiDGAM " 
SAGGPCPAAAGGGPGGASCSVGAPGGVSMFRWLEVLEKEFDKAF 
VDVDL LLGE I DPDQAD I TYEGRQ KMTS LS S CFAQL CHKAQS VS Q 
INHKLEAQIiVDLKSELTEl'QAEKWLEKEVHDQLLQLHS IQLQL 
HAKTGQSADSGTIKAKLSGPSVEBLBRELKAN 


6750 


3 


428 


S C E S RR PG AKWVWASG AL P RDTTGLGS EQ PS G D V AQ SNRATMG T 
TAPGPIHLLELCDQKLMEFLCNMDNKDLVWLEEIQEEAERMFTR 
EFS KE PELM P KT PSQKNRRKKRR I S YVQ DENRDP I RRRLSRRKS 
RSSQLSSRR 


6751 


152 


1417 


PTKATEMAGASVKVAVRVRPFNSREMSRDSKCIIQMSGSTTTIV 
NPKQPKETPKSFSFDYSYWSHTSPEDINYASQKQVYRDIGEEML 
QHAFEG YNVCI FAYGQTGAGKS YTMMGKQEKDQQG I I PQLCEDL 
FSRINDTTNDNMSYSVEVSYMEIYCERVrdIiLNPKNKGNLRVRE 
HPLLG PYVEDLS KLAVTS YKD1QDLMDSGNKARTVAATNMNETS 
SRS HAVFNI I FTQKRHDAETNITTE KVS Kl SLVDLAGS ERADST 
GAKGTRLKEGANINKS LTTLGKVI S AI*AEMDSGPNKNKKKKKTD 
FIPYRDSVLTOLLRENMGNSRTAPIVAAI^PADINYDETLSTLR 
YADRAKQIRCNAVINEDPNNKLIRELKDEVTRLRDLLYAQGLGD 
I T DMTNALVGMS PS SS LSALS SRNV 


6752 


24 


1834 


RNCVPPIiGCYRSRVKFHSD IKMQ YSHHCEHLLERIjNKQREAGFL " 

CZDCTIVIGEFQFKAHRNVLASFSEYFGAIYRSTSENNVFLDQSQ 

VKAIX5FQKIiLEFIYTGTIiNIJ)SWNVKEIH 

KI KMEDFAFIANPSSTE ISS ITGNI ELNQQTCLLTLRD YNNREK 
SBVSTDLIQANP KQG ALAKKSS QTKKK KKAFNS PKTGQNKTVQY 
PSDILENASVELFLDANKLPTPWEQVAOINDNSELELTSVVEN 
TFPAQDIVHTVTVKRKRGKSQPNCALKEHSMSNIASVKSPYEAE 
NSGEELDQRYSKAXPMCOTCX3KVFSEASSLRRHMRIHKGVKPYV 
CHLCG KAFTQCNQLKTHVRTHTGEKFYKCELCDKG FAQKCQLVF 
HSRMHHGEEKPYKCDVCNLQFATSSNLKIHARKHSGEKPYVCDR 
CGQRFAQAS TLTYHVRRHTGE KP YVCDTCGKAFAVS SS L ITHS R 
KHTGEKPFICELCGNSYTDIKNLKKHKTKVHSGADKTLDSSAED 
HTLSEQDS IQKS PLSETMDVKPSDMTLPLALPLGTEDrtHMLLPV 
TDTQS PTSDTtiLRS TVNG YS E PQL I FLQQL Y 


6753 


2 


1305 


VPSLPYPPQKWAHTEFTTSSDSETANGIAiCPDPVMPGGEEKAS 
PFGIKLRRTNYSLRFNCDQQAEQKKKKRHSSTGDSADAGPPAAG 
SARGEKEMEGVAIiKHGPSLPQERKQAPSTRRDSAEPSSSRSVPV 
AHPGPPPASSQTPAPEHDKAANKMPLAQKPALAPKPTSQTPPAS 
PLS KLSRP YLVELLSRRAGRPDP E PSEPS KEDQBSS DRRPPSP P 
GPEBRKGQKRDEBEEATERKPASPPLPATQQEKPSQTPBAGRKE 
KPMLQSRHSLDGSKLTEKVETAQ PLWITLALQKQKGFREQQATR 
EERKQAREAKQABKLS KENVSVSVQPGSSSVSRAGSLHKSTAIjP 
EEKRPETAVSRLERREQLKKANTLPTSVTVE I SYSSPAAPLVKE 
VSKRFSSPDDAPVSSEPAWLALAKRKAKAWSDCPLIIK 


6754 


2 


413 


F VRRRRRRLGGPE VNTMS S LHKS R I AD FQDVLKE PS I ALEKLRE 

LSPSG1PCEGGLRCLCWKILLNYLPLERASWTSILAKQRELYAQ 

FLREMIIQPGIAKANMGVSREDVTFEDHPLNPNPDSRWNTYPKD 
NEVLL 


6755


298 


1343 


PGLQLQVAIiEADWFLDMPGGRRGPSRQQLSR^ALPSLQTLVGGG 
CGNGTGLRNRNGSAIGLPVPPITALITPGPVRHCQIPDLPVDGS 
LLFEFLFFIYLLVALPIQYrNIYXTVWWYPYNHPASCTSLNFHL 
IDYHLAAFITVMLARRLVWALISEATKAGAASMIHYMVLISARL 
VLLTLCGWVLCWTLVNLFRSHSVLNLLFLGYPFGVYVPLCCFHQ 
DSRAHI^TDYNYVVQHEAVEESASTVGGtAKSKDFLSLLLESI, 
KBQFNNATPIPTHSCPLSPDLIRNEVECLKADFNHRIKEVLFMS 
LFSAYYVAFLPLCFVKVSGYLTFMCFLDLCVNYINWVFLV 


6756 


180 


754


IERALGSLPl>S I PVSWGSLRTLKYQQQPLRPKVJ^CQTRVQCHD " 
LRSLQPQPPGLKQSFCLRVU5LQTGATTPGLRDLTCKELI ILTE 
REAQKRKKRKEKESGMALTQGPLTFRDVAIEFSQEEWKSLDPVQ 
KALYWDVMDBNYRNLWLGKDNFALEVKICPRVFLYFLCCLSWE 
PFHYLTETEALLTHK 


6757 


2 


459 


NSRVEAPEAHSRESQGSDAMRKHLSWWWLATVCMLLF^HLSAVQ 
TRG I KHRI KWNRKAIiPSTAQITEAQVAENRPGAF I KQGRKLDID 
FGABGNR YYEAN YWQ FPDG I HYNGCSEANVTKBAFVTGC INATQ 
AANQGSFQKPDNKLHQQVLW 


6758 


1 


1008 


ASGPELPGRRFRDRAPWI^ARLLRGVIAVWVSLSALGPGSFCRR 
RVPSLAQIjGHSEAAPSPDDVRWSRVPDRCPEERDRAWPPPPPPs 
LPPSFRRNMANNS PALTGNSQPQHQAAAAAAQQQQQCGGGGATK 
PAVSGKQGNVL PL WGNEKTMNIiNPMI LTN I LSS P YFKVQLYELK 
TYHEWDE I Y FKVTHVEP WE KGSRKTAGQTGMCGGVRGVGTGG I 
VSTAFCLLYKLFTLKLTRKQVMGLITHTDS P YIRALGFMYIRYT 
QPPTDLWDWFESFX.DDEEDLDVKAGGGCVMTIGEMLRSFLTKLE 
WFSTLFPRI PVPVQKNIDQQ I KTRPRKI 


6759 


1 


513 


RKHNFHSIiDGTSTRAFHPQTGLPLLSSPVPQRKTQSGCFDLDSS 
LLHLKSFSSRSPRPCLNIEDDPDIHEKPFLSSSAPPITSIiSLLG 
NFEESVLNYRFDPLGr VDGFTAEVGASGAFCPTH1.TLPVEVS FY 
S VSDDNAPS P YMG VI TLESLG KRGYRVP PSGTIQWCVL 


6760


239


606


^iiS KKKGL S AEEKRTRMME I F S ETKD VFQLKDLEKI APKE KGIT
AMSVKEVLQSLVDDGMVDCERIGTSNYYWAFPSKAIJlARKHKIiE
VLESQLSEG SQ KHAS LQKS I EKAKIGRCETEERT


6761


29 


1733 


BRTJ^GLREVAAPSDVADAAVSRRGRCCCC^CTQTQVAQDCP^ - 
55SSVQRCELSLFQSLHTMTSKKLVNSVAGCADDALAGLVACNP 
NLQLLQGHRVALRSDLDSLKGRVALLSGGGSGHEPAHAGFIGKG 
nil i JavuvaAv JrrSPAVGS liiAAIRAVAQAGTVGTLLIVKNYTGD 
RLNFGLAREQARAEG I P VEMWIGDDSAFTVLKKAGRRGIjCGTV 
LlHKVAGALAEAGVGLEEIAKQVNVVTKAMGTLGVSIiSSCSVPG 
SKPTFEI>SADEVEI^I^IHGEAGVRRIKMATADEIVKLMLDHMr 
NTTNASHVP VQPGSSVVMMVNNLGGltS FLELGI IADATVRSLEG 
RGVKIARALVGTFMSALEMPGISIjTLLLVDEPLLKLIDAETTAA 
AWPKVAAVS I TGRKRS RVAP AE PQE APDSTAAGGS ASKRMAL VL 
ERVCSTIJjGLEFJILNALDRAAGDGDCGTTHSRAARAIQEWLKEG 
PPPAS PAQLLSKLSVLLLEKMGGS SGALYGItFLTAAAQPLKAKT 
SLPAWSAAMDAGLEAMQKYGKAAPGDRTMIiDSLWAAGQEL 


6762 




613 


ASTISWRLCVAQAEARKPVPVAGERAGGGAMWFMYLLSWLSLFI 
QVAPITLAVAAGLYYIiABLIBBYTVATSRIIKYMIWFSTAVIilG 
LYVFERFPTSMIGVGLFTNLVYFGLLQTFPFIWLTSPNFILS CG 
LVWNHYLAFQFFAEEYYPFSBVLAYFTFCLWIIPFAPPVSLSA 
GBNVLPSTMQPGDDWSNYFTKGKRQK 


6763


2 


760 


SGPDFPGRRFRGCCCVRPPAGAGMELGGHWDMNSAPRLVSETAE 
RKQEQKTGTEAB AADSGAVGARRFLLCLYLGGFLDLFG VS MWP 
LliSLHVKSLGASPTVAGIVGSSYGILQIjFSSTLVGCWSDWGRR 
S SLLACI LLSALGYLLLGAATNVFL FVLARVPAGIFKHTLS ISK 
ALLSD WPEKERPLVI GHFNTASG VG FT LG P WGG Y J iTELEDGF 
YliTAFICFLVFILNAGLVWFFPRREAKPGSTE 


6764 


80


438 


LKKMDTMMLSVRNLFEQLVRRVEILSEGNEVQFIQLAKDFEDFR 

XXWQRTOHBLGKYIOttilMKAETERSA^ 

RQRAEADCEKLERQIQLIREMLMCDTSGSIQ 


6765 


3 


550 


ARYSRVDHFCRRRCRAVARAPRFLLQFPSGPSRHFLAACVARWL ~ 
RGSVIjVSEAIiSGSAKDGIVTEVAVGVKRGSDELLSGSVLSSPNS 
NMS SM WTANGNDS K KFKGEDKMDGAPSRVLHIRXLPGBVTETE j 

vialglpfgkvtnilmlkgknqaflelateeaaitngnyysavt 

PHLRNQ * \


6766


1 


1287 


eggsfkasltwlwplgemklhcevevisrhlpalglrnrgkgvr ! 
avi^lcqqtsrsqppvrafllistlkdkrgtryelrenieqfft 1 
kfvdegkatvrlkeppvdiclskansssijcgflisamriiahrgcn 
vdtpvstltpvktsefenfktkmvitseocdypiisknfpysliehii ' 
qtsycglvrvdmrmlclkslrkldlshnhikklpatigdlihlq 
elnitndnhles fs valchstlqkslwsldlsknki k3vlp vqfcq 

LQELKNLKLDDNELlQFPCKIGQLIlTLRFLSAAllNKLPFliPSEF 
RNLSLEYLDLFGNTFEQPKVLPVI KLQAPLTLLESSARTILHNR 
IPYGSHIIPFHLCQDI^TAKICVCXSRFCLNSFIQGrTTWNIiHSV 
AHTWLVDNLGGTEAP 1 1 S YFCSLOCYVNSSDI 


6767


336 


919 


APMICLCSSDLQFRYKEAFLRDRGLQIGYCSVDDDPRMKHFLNV 

gri^sdneykkdfaksrsqfksstdqpgllqakrsqqiIasdvhy 
rqplpqptcdpeqlglrhaqkahqlqsdvkyksdlnltrgvgwt 
P PGS y kvemarraaelanarg lglqgayrgaeaveagdhqs GE V 

NPD ATE I LHVK KKKALLL 


6768


2 


363 


pgstiscyllsbgslplcmqvacgbekhraptmktlrarfkkts 
lrls p tdlgs cp pqgpcp i pkpaargrrqs qdwgks derllqav 
enndaprvaaliarkglvptkldpbgksafhl 


6769


284 


396 


MSTPDFSTAEJWQELiANEVSCLKAWIjTLMJjQAMGQAD 


6770


1 


397


QRNYQVIWSSTt4AKLHDYYKDEVVKKi^TBFNVNSVMQVPRVEK 

itlnmgvgeaxadjoolildnaaadlaaisgqkplitkarksvagf 
kirqgypigckvtlrgermwefferlitiavprirdfrglsaks 


6771 


3 


378


APAGTLAMTGK5VKDVDRYQAVLANIiLjLEEDNKFCADCQSKGPR 
WASWNIGWICIRCAGIHRNLGVHISRVK5VNLDQWTQEQIQCM 
QEMGNGKANRLYEAYLPETFRRPQIDPYLFWSNLEG 


6772 


1 


1400 


jvwmx* ia\g[\3Fi a vinwr j»IM X v X ioii \CiKK IJJluio JUoIj»Jj.»..AoSYDIAA 

CLCLTFVSYFGGSG\HKPRWLGWGR\VLMGTGSLVFALPHFTAG 
P* *GWKLDAGVRTCPANPR\PVCAG\HTSGLSRYQLVFMLGQFL 
HGVGATPLYTIX3VTYLDENVKS3C3PIYIAIFYTAAILGPAAGY 
L I GGALLNI YTEMG RRTELTTESPLWVGAWWVGFLGSGAAAF FT 
AVPILGYPRQLPGSQRYAVMRAAEMHQLKDSSRGEASNPDFGBCT 
IRDLPLSIWI^LKNPTFILLCIaAGATEATLITGMSTFSPKFLES 
QF5LS AS EAATLFG YLWPAGGGGTFLGGFFVNKLRLRGSAV I K 
FCLFCTWSLl^ILVFSLHCPSVPMAGVTASYGGSL.LPEGHLNL 
TAPCNAACSCQPBHYSPVCGSDGLMYFSLCHAGCPAATETNVDG 
QKVYRDCS CI PQNLSSGFGHATAGKCTST 


6773 


1 


630 


x-nanjrnan,n.ip>j\n£ t nx V VJjrv IvjkPuHFPFQYHRQLYHKCTHIC(3 — 

RPGPQPWCATTPNFDQDQRWGYCLEPKKVKDHCSKHSPCQKGGT 
CVNMPSGPHCLCPQHLTGNHCQKEKCFEPQLLRFPHKNEIWYRT 
EQAAVARCQCKGPDAHCQRLASQACRTNPCLHGGRCLEVEQHIU, 
CHCPVGYTG?FCDVGE*GSGASRRPAPRWDGLAR 


6774 


146


389 


LTELSDQQ YFLFFI LS S / WVPTFLSMD VDGRVIKADS FS KI l"gs~ 
GLRIGFLTGPXPLI ERVILKIQVSTLHPSTFNQLM ISQ 


6775 


104 


614 


TCPSQIJ?VLTARGGRRAPSPQLWTLVI^ALlEtekWRSkRlLRMl?s~ 
GRPETMENLPALYTI FQGEVAMVTDYGAF2 KI PGCRKQGLVHRt 
HMSSCRVDKPSEIVDVGDKVWVKLIGREMKNDRIKVSLSMKVVii 
QGTGiCDIiDPNNV\SLS KKRGGGDPSR ITLGRRSPLRLS 


6776 


3 


1108


HERHERHEGALSQDALLRISIPLDSNMRPEKCRRFVHPQWQLElT" 
LNGTFPKTSDAEWEPCVDGWVYDRISFSSTIVTEWDLVCDSQSL 
TSVAKFVTMAGMMVGGILGGHLSDRFGRRPVLRWCYLQVA1VGT 
CAALAP-rFLIYCSLRFLSGIAAMSLITNTIMLIAEWATHRFQAh! 
GITLGMCPSGIAFMTLAGLAFAIRDWHILQLWSVPYPVI FLTS 
SWLLESARWLI INNKPEEGLKELRKAAHRSGMKNARDTLT£jEI t, 
KSTMKKELEAAQKKKPFLGERLHMPNICKRISLLPFTKFANFKA 
YFGLNLHG/ LKHLGNNVFliLQTLFGAV/TPPGQLVLHLGHWGSG 
RVS S RGRVNCLGLFVLQVW 


6777 


779 


63 


CFFHGPAWRDCEVRATFAKKQGQSGIISCIAFSPAQPL^AaSST^ 
GRS LGLYAWDDGSPLALLGGHQGG i thlcfhpdgnrffs g arkd 

abllcwdlrqsgyplwslgrevttnqri yfdldptgqflvsgst 
sgavsvwdtdgpgndgkpepvlsflpqkdctngvslhpslpllg 

HCLPVSVCFLSPTESGGRRRGAOPSLGS PRRHVHLECRLQLWWC 
GGGARLQHP+ ♦ S PRARKGR 


6778 


311 


805 


iqsitdesrgsirrknpantrlrlnvp\ebtagdse/brspeeb^- 

VQADPRIRS AS PKCPTSS PFPKGRS PEGBGET\ DPE KVHFHPGp 
KDKSVAEKW\ KG P\SPVSSEGI KDFFSMKPEWENLNQSNVRRMH 
T\AVRLNEVIVKKSRDAKLVLLNMPGPPRNRNGDENY 


6779


2 


535


RALRRQPRLLAANG I E PES MAI SEP I KGSRKPCVNKEEIALKKP^ 
MAKCAWKG PREP PQDARAEAESPG GAS ESDQDGGHESPPKK3CAV 
AWVSAKNPAPMRKKKKVS LGPVS YVLVDSEDGRKKPVMPKKGPG 
SRREASDQKAPRGQQPAEATASTSRGPKAKPEGSPRRATNESRK 


6780


3 


403 


hevndnkpeininlmspgkeeisyifbgdpidtfvaLvrVqdkd^ 

SGLNGEIVCKLHGHGHFKLQKTYENNYLILTNATLDREKRSEYS 
LTV1 AEDRGTP S LSTVKH FTVQI ND INENP PHFQRSRY E FVISE 
K 


6781 


1 


1269 


APTRPVFPTLQDLSSSKEPSNSLNLPHSNELCSSLVHPBLSEVS^ 
SNVAPSIPPVMSRPVSSSSISTPLPPNQITVFVTSNPITTSANT 
SAALPTHLQSALMSTVVTMPNAGSKVMVSEGQSAAQSNARPQFI 
TPVFINSSSI IQVMKpSQPSTIPAAPLTTNSGLMPPSVAWGPL 
HIPQNIKFSSAPVPPNALSSSPAPNIQTGRPLVLSSRATPVQLP 
SPPCTSSPWPSHPPVQQVKELNPDEASPQVNTSADQNTIiPSSQ 
STTMVS PLLTNS PGSSGNRRSPVSSSKGKGKVDKIGQILLTKAC 
KKVTGSLEKGEEQYGADGETEGQGLDTTAPGIiMGTEQLSTELDS 
KTPTPPAPTLLKMTSSP VGPGTASAGPSLPGGALPTSVRS 1 VTT 
L VP SEL I SAVPTT KSNHG G IAS ES LAG 


6782 


3 


1327 


RKPTVIRIPAKPGKCLHEDPQSPPPLPAEKPIGNTFSrVSGKLS 
NVERTRNLESNHPGQTGGFVRVPPR L P PRpVNGKTI PTQQ P PTK 
VPPER P PPPKLSATRRSNKJCLP FNRS SS DMDLQKKQSNLATGLS 
KAKSQVFKNQDPVLPPRPKPGHPLYSKYMLSVPHGIANEDIVSQ 
NPGELSCKRGDVL VMLKQTENNYLECQKGEDTGRVHLSQMKL,IT 
PLDEHLRSRPNPFS PPKAPSHAQKP VDSG APHA WLHDFPAEQ V 
DDLNLTSG E IVYLLEKIDTDW YRGNCRNQ IG I FPANYVKVI I D I 
PEGGNGKRECVSSHCVKGSRCVARFEYIGEQKDELSFSEGEI 1 1 
LKEYVNEEWARGEVR£3RTGI FPLMFVEPVEDYPTSGANVLSTKV 
PLKTK KEDSGSNSQ VNS L PAE WCEALHS FTAETS DDLS FKRGDR 
I 


6783 


3 


1750 


S YHHHHAQQSAAAS PNLTASQKTVTTT S M ITTKTLPLVLKAAT A 
TMPAS WGQRPT I AMVTA INSQKAVLSTDVQNT P VNLQTS SKVT 
GPGAEAVQI VAKNTVTLQVQATPPQP I KVPQFI PP PRLTPRPNF 

PTSQNS IH P VR VVNGQTATI AKTFPMAQLTS I VXATPGTRLAGP 
QTVQLS KPSLEKQTVKSHTE TDEKQTES RTITP PAAPKP KRE EN 
PQKLAFMVSI/3LVTHDHLEE IQSKRQERKRRTTANPVYSGAVFE 
PERKKSAVTYLNSTMHPQTRKRGRPPKYNAVLGFGALTPTS PQS 
SHPDSPBNEKTBTTFTFPAPVQPVSLPSPTSTDGDIHEDFCSVC 
RKSGQLLMCDTCSRVYHLDCLDPPLKT I PKGMW1 CPRCQDQMLK 
KEEAIPWPGTLAIVHSYIAYKAAKEBEKQKLLKWSSDLKQEREQ 
LEQKVKQLSNS I S KCMEMXNT I LARQKEMHSSLEKVKQL I RL I H 
GIDLSKPVDSEATVGAISNGPDCTPPANAATSTPAPSPSSQSCT 
ANCNQGEETK 


6784


3 


1750 


S YHHHHAQQSAAAS PNLTASQKTVTTTSM I TTKTLPLVLKAATA 
TMPAS VVGQRPTIAMVTAJNSQKAVLSTDVQNTPVNIiQTSSKVT 
GPGAEAVQI VAKNTVTLQVQATPPQP I KVPQFI PP PRLTPRPNF 
L P Q VR P KP VAQNN I P I APAP P PMLAAPQL I QRP VMLTKFTPTTL 
PTSQNS IHPVRVVNGQTATI AKTFPMAQLTS IVIATPGTRLAGP 
QTVQLS KPSLEKQTVKSHTETDE KQTES RT I TP PAAPKP KREEN 
PQiOAFMVSI^LVTHDHLBEIQSKRQERKRRTTANPVYSGAVFE 
tr £iKXVi\o/iv 1 XLiEia i ntiriil KJUitjKPPKYNAVLGFGALTPTSPQS 
SHPDSPENEKTETTFTFPAPVQPVSLPSPTSTDGDIHEDFCSVC 
RKSGQI^CDTCSRVYHIJ)C^DPPIJCTIPKGMWICPRCQDQMLK 
KEEAI PWPGTLAIVHSYXAYKAAKEEEKQKLLKWSSDLKQEREQ 
LEQKVKQLSNSISKCMEMKNTIIiARQKEMHSSLEKVKQLIRLIll 
G IDLS KP VDS EATVGAISNGPDCTPPANAATSTP APSPSSQSCT 
ANCNQGEETK 


6785


1 


528 


LGNTVLH YCSM YSKPECLKLLLRSKPTVDI VNQAGETALD IAKR 
LKATQCEDLLSQAKSGKFNPHVHVEYEWNLRQBE IDESDDDLDD 
KPSPVKKERSPRPQSFCHSSSISPQDKLALPGFSTPRDKQRLSY 
GAFTNQIFVSTSTDSPTSPTTEAPPLPPRNAGKGPTGPPITPHR 


6786 


1820 


1397 


RSPKVLVIJVPTRELANHVSPJ)FKpi\TRiCLTVAkFYGGTSYQSQ 
INHXRNG IDI LVGTPGRIKDRLQSGRLDLS KIJUrVVLDEVDQML 
DLGFAEQVEDIIHESYKTDSEDNPQTLLFSATCPQWVYTVA\KK 
YMKSRYEQVDLDGKMTOKAATTVEHLATOCMWSOPPAVTnnvT n 

VYSGSBGRAXIFCETKKNWEMAMNPHIKQNAQCLHGDIAQSQR 
E I TLKGFREGS FKVLVATNVAARGLDI PEVDLVI QSS P PQDVES 
YIHRSGRTGRAGRTGI CICFYQPRERGQLRYVEQKAGITFKRVG 
VPSTMDLVKSKSMDAIRSLASVSYAAVDFKRPSAQRLIEEKGAV 
DALAAALAHI SGASSFEPRSL ITSDKGFVTMTLESLEEIQDVSC 
AWKELNRKLSSNAVS QITRMCLLKGNMGVCFDVPTTESERLQAE 
WHDSDWILSVPAKLPEIEEYYDGNTSSNSRQRSGWSSGRSGRSG 
RSGGRSGGRSGRQSRQGSRSGSRQDGRRRSGNRNRSRSGGHKRS 
FD* VFYHLVDFLSDFLVDSVYLTGRQIDHLTGLTGL IDHLTSHS 
SVWN 


6787


2646 


2270 


PSSFPKNVPLEEIjEEPPK*KRSGLGSLTPKSQIQNGP*PQTFPF 
FELGS PSGVIS AHCNLRLLGS SDS PAPASRVAGI IGTCHHAWLI 
LVFL VBMG FHHVGQAGLKLLTL\ V I HP PWP PKVLGLQT 


6788 


16 


336 


GGTVDLRXDMLAVSVLAAVRGGR/ATVRRVRESNVLHEKSKGKf - 
REGAEDKMTSGDVLSNRKMFYLLKTAFPSVQINTEEHVD\ELDQ 
BVILWGS * DS *G YPKGK* LLPKBVPSR/ 1 RVLLSGLTPLDATQEV 
FTEDLS K\ YVTTMVCVAVWG KPMLG V I HKP FSE YTAWAMVDGGS 
NVKARSS YNBKTPRI WSRSHSGMVKQVALGTFGNQTTT I PAGG 
AGVKVLAI*IJ3VPDKSQEKADLYIHVTYIIQCWD1CAGNAILKALG 
GHMTTLSGEEISYTGSDGIEGGLLAS irmnhqalvrklpdlekt 
GHK 


6789 


2 


678 


gnginvlkiapesaikfmayeqikrlvw**pgds*gf/yerlva 

GS LAGAI AQS S I Y PMEVLKTRMALRKTGQ YSGMLDCARR I LARE 
GVAAFYKGYVPNMLGIIPYAGIDLAVYETLKNAWLQHYAVNSAD 
PGVPVLLACGTMSSTCGQLASYPLALVRTRMQAQAS IEGAPEVT 

MSSLFKHILRTEGAFGLYRGLAPNPWKVIPAVSISYWYENLKI 
TLGVQSR 


6790 


2 


4068 


APPAGRRRMQAAPRAGCGAALLLWIVSSCLCRAWTAPSTSQitCD 
EPLVSGLPHVAPSSSSSISGSYSPGYAKINKRGGAGGWSPSDSD 
HYQWLQVDFGNRKQISA1ATQGRYSSSDWVTQYRMLYSOTGRNW 
KPYHQDGNIWAFPGNINSDGWRHELQHPI IARYVRI VPLDWNG 
EGRIGLRIEVYGCSYWADVINPIXSHVVLPYRFRNKKMKTLKDVI 
ALNFKT S ESEGVI LHGEGQQGD YITLBLKKAKLVLS LNLGSNQL 
GPI YGHTS VMTGSLLDDHHWHS WT ERQGRS I NLTLDRSMQHFR 
TNGEPD YLDLD YE I TFGQ I P PS GKPSS S S RKNFKGCMES INYNG 
VN I TDLARRKKLEPSNVGNLS PS C VEP YTVP VFFNATS YLEVPG 
RIiNQDLFS VSFQFRTWNPNGLL VFSHFiU)NLGNVE IDLTESKVG 
VHINITQTXMSQIDISSGSGLNDGQWHEVRFIjAKENFAILTIDG 
DEASAVRTNSPLQVKTGEKYFFGGFLNQMNNSSHSVLQPSFQGC 
MQLIQVDDQLVMLYEVAQRKPGS FANVS IDMCAI IDRCVPNHCE 
HGGKCSQTWDSPKCTCDETGYSGATCHNS IYEPSCEAYKHIiGQT 
SNYYWIDPDGSG PLG PLKVYCNMTE DKVWTI VSHDLQMQTPWG 
YNPEKYSVTQIiVYSASMDQISAITDSAEYCEQYVSYFCKMSRLL 
NTP DGS P YTWWVGKANEKHYYWGGSGPG 1 QKCACGI ERNCTDPK 
YYCNCDAD YKQWRJCDAGFLS YKDHLP VS QVWGDTDRQGSEAKL 
SVGPLRCQGDRNYWNAASFPNPSSYLHFSTFQGETSADISFYFK r 
TLTPWGVFLENMGKEDFIKLBLKSATEVSFSFDVGNGPVEIWR 
S PTPLND0QWHRVTAERNVKQAS LQVDRL PQG; IRKAPTEGHTRL 
EL YSQLFVGGAGGQOGFLGCIRS LRMNGVTLDLEERAKVTSG F I 
SGCSGHCTSYGTNCENGGKCLERYHGYSCDCSNXAYDGTFCNKD 
VGAFFEEGMWLRYNFQAPATNARDSSSRVDNAPDQQNSHPDLAQ 
EE XR FS FS TTKAPC I LLYI SS FTTDFLAVLVKPTGSIiQ IRYNLG 
GTREPYWIDVDHRNMANGQPHSVNITRHBKTrFtiKLDHYPSVSY 
HL PS SSDTLFNS P KS LFLGKVIETGKIDQE IHKYNTPG FTGCLS 
RVQFNQIAPLKAALRQTNASAHVHIQGELVESMCGASPLTIiSPM 
SSATDPWHLDHLDSASADFPYNPGQGQAIRNGVNRNSAIIGGVI 
A\W1FTPSLCTP\VLP*SR*HVSPHKGTLPIPNEAKGAGSRQK 
KPGRRPSM23NDPPTSQRPIDESKKEWPS£E|RGGYLAMG 


6791 


1801 


1193 


TGHEGAKGE KGDKGDLGPRGE RGQHG PKGEKG YPG I PP E L/ PGW 
SAW* SWIjTAASTKVQAILLPQPLE * LGLQ IAFMASLATHFSNQ 
NSGI I FSS VETNIGNFFDVMTGRFGAPVSGVYFFTFSMM KHEDV 
titi V X V il^nWgNTVFSMYSYEMK^ 
LRMGNGALHGDHQRFSTFAGFLLFETK 


6792 


33 


1073 


VRHTNWGVDMYLFSLGSESPKGAIGHlVSTEKTILAVBRKKVLti " 
PPL WNRTF 3 WGFDDFS CCLGS YGSDKVLMTFENLAAWGR CLCAV 
CPSPTTIVTSGTSTVVCVWELSMTKGRPRGLRLRQALYGHTQAV 
TCLAASVTFSLLVSGSQDCTCILWDLDKLTHVTRLPAHREGISA 
IT I S DVSGT I VSCAGAH LS LWNVNGQ P LAS 1 TTAWG PEGAITCC 
CLMEGPAWDTSQIIITGSQDGMVRVWKT/VGCEDVCSWTASRRG 
APGSASKPKRPQVGEEPGliESRAGR* HCFDREAQQNQP\ PVTAL 
RVSRNHTKLLVGDERGR I FCWSADG * EERGSRGSGTTVPG 


6793


2340 


805 


GRKE ANY \ YGSLTQAGTVS LG LDAEGQBVFVP PSAVL PMVAPND 
LVFDGWDISSLNLAEAMRRAKVLDWGLQEQLWPHMEALRPRPSV 
YIPEFIAANQSARADNLIPGSRAQQLEQIRRDIRDFRSSAGLDK 
VIVLWTANTERFCEVI PGLNDTAENLLRTIBLGLEVS PSTLFAV 
AS I LEG CAFLNG S PQNTL V PGALBLAWQHRVFVGGDDFKSGQTK 
VXS VLVDFLI GSGLKTMS I VSYNHLGNNDGENLSAPLQFRS KEV 
5 KSNWDDMVQSNPVLYT PGEEPDHCWI KYVPYVGDS KRALDE 
YTSELhujGGTNTLVLHNTCEDSLLAAPIMIiDLALLTELCQRVSF 
CTDMDPEPQTFHPVIiSLliSFLFKAPLVPPGSPVVNALFRQRSCI 
ENILRACVGLPPQNHMLLEHKMERPGPSLKRVGPVAATYPMLNK 

KGPVPAATNGCTGDAMfiHIiTJRFP PMPTT*f3Pf3WT\7QP T.T7T .Da AD 
HDPTLKAPTNKGRCHFSPPSTWGSWGL 


6794 


169 


1344 


DDVXRKPEA^*EKPGPPSRPGVRGGRBRAGGRGSHGARS^T" 
EPAPPAPAPPEDHPDEEMGFTIDIKSFLKPGEKTYTQRCRIiFVG 
NLPTDITEBDFKRIiFERYGEPSEVFINRDRGFGFIRLBSRTLAE 
IAKAELDGTILKSRPLRIRFATHGAALTVKNLS PWSNEIi LEQA 
FSOFGPVSKAWWnDRGR ATf5K"f3 FVP PA A If P P & P JTA T .T? D rvrw 

AFLLTTTPRPVIVEPMEQFDDEDGLPEKLMQKTQQYHKEREQPP 
RFAQPGTFE FE YASRWKALDEMEKQQREQVDRNI REAKE KI*EAE 
MEAARHEHQLMLMRQDLMRRQEELRRLEELRNQELQKRKQ IQLR 
HEEEHRRREBEMIRHREQEELRRQQEGFKPNYMENYVCHPIiR 


6795


1740 


1010 


TMSKSAVKISIJ)LLSNPLCSQIX?DLLNMVTALDTAMKPJ^AFNQ 
BKVNQIQKTVIEPLKKFGSVPPSLNMAVKRREQALQDYRRLQAK 
VEKYBEKEKTGPVLAKIiHQAREELRPVREDFEAKNRQLLEEMPR 
FYGSRLDYFQPS FESL X RAQWYYS EMHKIFGDLS HQLDQPGHS 
DEQRERENEAKLS ELRALS IVADD 


6796 


48 


683 


GKEIQIPTIKI^I^LE*PVGALaKGWSF*VsHV7^QLGW 
LTRAVRSS WRWELCVSAQEWSQRSA* 3S P3PVGACPSLNPPET 
SVQEGRDCWQR* LPRLFSALVGQPGCWPQGAPPERCV* PGRCKW 
HJjQSQVI*R*ERRRCCRCLPRFA*GWRRRHQRLGIjGIHPAPLGST 
SPPHPEGNSQQCRR*GWAAELRLPSSWL*GKLGC* 


6797 


1620 


2X1 


TLGPMAQGLLS ASGTTTEATWTRPTTHLTLI RWWLLTASRVDP P 
ERPPPPPSDDLTU.ESSSSYKNL/DAQIPQ/DWSMS PST3G +RP 
LTSRASS IMRSRTAI PSAS *SRLTTKHTVGGSPSAWRPRPTSRS 
VSTPVSSSTBTTASGSCLTWWSSSPAPCPSSSAPAHSFEASCCK 
TSLWGSCGGSGDGSSACGSGWNLSMAGTSCSSPAMCSPSRAPS* 
RSASRPRTWRATTSAASSWAPRRCWCGWA*SAT*PSSTTTISSS 
PHOGWPCPASCASAAAWLSSTWATASVAGSCWGPIM*SSAHSPW 
CLSACSRSSMGTTCL+RSPP\ SGASRAAAAWCGSSPS STFTPSS 
ASSSTWCSASSSRSSPAPTTPSSIPAAQAQRRASCRPTSHSART 
AP P PAS S AAGAAR PAAFS AAAEGTPRRS I RCW 


6798 


3894 


1696 


STISWESLESWLNKATNPSNRQEDWEYIIGFCDQINKELEG*VS 
ALWGQLRGSGLGRGTTMAKEGQPGS PRLSALECVLLVPQ\ PQI A 
VRLLAHKIQSPQEWEALQALTYLGDRVS EKVKTKVIELLYSWTM 
ALPEBAKIKDAYHMLKRQGI VQSDPPIPVDRTLI PS P PPRPKNP 
VFDDEEKSKLLAKLLKSKNPDDLQEANKLIJCSMVREDEARIQKV 
TKRLHTLEEVNNNVRLLSEMLLHYSQEDSSDGDRELMKELFDQC 
ENKRRTLFKLASETEDNDNSLGDIIjQASDNLSRVINSYKTI ieg 
QVINGEVATLTLPDSEGNSQCSNQGTLIDLAEIiDTTNSLSSVLA 
PAPTPPSSGIPILPPPPQASGPPRSRSSSOAEATLGPSSTSNAI, 
SWLDEELLdiGLAOPAPNVPPKESAGNSQWHLLQREQSDLDFFS 
PRPGTAACGASDAPLLQPSAPSSSSSQAPLPPPFPAPWPASVP 
APS AGS SL FSTGVAPALAPKVB PAV PGHHGLALGNSALHHLDAL 
DQLLEEAKVTSGLVKPTTSPLIPTTTPARPLLPFSTGPGSPLFQ 
PLSFQSQGSPPKGPELSIiASIHVPLESIKPSSALPVTAYDKNGF 
R I L FH PAKECPPGRPDVLWWSMLNTAPLP VKS I VLQAAVPKS 
MKVKI^PPSGTBLSPPSPIQPPAAITQVMLIANPLXEKVRLHYK 
LTFALGEQLSTBVGEVDQPPPVEQWGNL 


6799 


3894 


1696 


STISWESLESWLNKATNPSNRQEDWEYIIGFCDQINKELEG*VS 
ALWGQLRGSGLGRGTTMAKEGQPGSPRLSALECVLLVPQ\PQXA 
VRLILiAHKIQSPQEWEALQALTYLGDRVSEKVKTnCVIELLYSWTM 
ALPEEAKIKDAYHMLKRQGIVQSDPPIPVDRTL1PSPPPRPKNP 
VFDDEEKSK1LAKLLKSKNPDDLQEANKLIKSMVREDEARIQKV 
TKRLHTLEEVNNNVRLLSEMLLHYSQEIDSSDGDRELMKELFDQC 

QVINGE VATLTLPDS EGNS QCSNQGTLI DLAELDTTNS LSS VLA 
PAPTPPSSGIPILPPPPQASGPPRSRSSSQAEATLGPSSTSNAL 
SWLDEELLCLGLADPAPNVPPKESAGNSQWHLLQREQSDIoDFFS 
PRPGTAACGASDAPLLQPSAPSSSSSQAPLPPPFPAPWPASVP 
APSAGSSLFSTGVAPALAPKVE PAVPGHHGLALGNSALHHLDAL 
DQLLEEAKVTSGLVKPTTSPLIPTTTPARPLLPFSTGPGSPLPQ 
PLSPQSQGSPPKGPEltSLASlHVPLESIKPSSALPVTAYDKNGF 
RILFHFAKECPPGRPDVLWWSMLNTAPLPVKS I VLQAAVPKS 
MKVKLQPPflGTELSPPSPIQPPAAlTQVMLLANPLKEKVRLRYK 


6800 


404 


1646 


RRSPSTGLSPVPQPSSPSLSDYSIPWSIiLLSGTlAWATPGK*AG 
* PQAW*LGLAPAIAFI /G LTRGR KQN KEKMAEGG S GD VDDAGDC 
SGAR YNDWS DDDDDSNES KS I VWYPPWARIGTEAGTRARARARA 
RATRAPJIAVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILE 
AAL I ALGNN AAYAFNRD I IRDLGGLP XVAKI LNTRDP I VKEXAL 
I VLNNLS VNAENQRRLKVYMNQVCDDTITSRLNS SVQLAGLRLL 
TNMTVTNEYQHMLANS ISDFFRLFSAGNEETKLQVLKLLLNLAE 
NPAMTRELLRAQVPSSLG \SLFNKKENKEVILKLLVI FENINDN 
FKWBENEPTQNQFGZGSIiFFFLKEFQVCADKVLGIBSHHDFLVX 
VKVGKFMAKLAEHMFPKSQE 


6801 


2 


1755


SAEEFESQQASVTMHDVnAESFBVLVDYCYTGRVSLSEANVERL 
YAAS DMLQLE YVREACAS FLARRLDLTNCTAILKFADAFGHRKL 
RSQAQS Y I AQNFKQLSHMGS IREETLADLTLAQLLAVLRLDSLB 

YLEGLLTKP I VKKYCIiDVIEGALQMRYGDLLYKSLVP VPNS SS S 
/ R* QQQLS C I CSRKSTPETGYVCQGDGDLLWTPQRSLS \RYDP Y 
S GDI YTW PS PLTSFAHTKTVTSSAVCVS PDHD I YLAAQPRKDLW 
VYKPAQNS WQQLADRLLCREGMO VA YLNGY I YI LGGRDP I TCVK 
LKEVEC YS VQRNQWALVAPVPHS FYS FEL I WQNYLYAVNS KRM 
LCYDPSHNMWLNCASLKRSDFQBACVFNDEIYCICDIPVMKVYN 
PARGEWRRZ SN I PLDS ETHNYQ I VNHDQKLLL ITSTTPQW KKNR 
VTVYEYDTREDQWINIGTMLGLLQFDSGFICLCARVYPSCLEPG 
QS FITEEDDARS ES STEWDLDGFS E LDSESG SS SSFSDD EVWVQ 
VAPQRNAQDQQGSL 


6802 


157 


1341 


ETFPLFFFLLS KTPGKTASMAHFVQGTSRM I AAESSTEHKECAE 
PSTRKNLMNSLEQKIRCLEKQRKEliLEVNQQWDQQFRSMKELYE 
RKVAELKTKLDAAERFLS TREKDPHQRQRKDDRQREDDRQRDLT 
RDRLQREEKEKERI4NEBLHELKEENKLLKGKNTLANKEKEHYEC 
EI KRLNKALQDALNIKCS FS E DCLR KS RVEFCHEEMRTEM EVLK 
QQVQIYEEDFKKERSORERLNQEKEELOQINETSQSQLNRLNSQ 
IKACQMBKEKLEKQLKQMYCPPCNCGLVFHLQDPWVPTGPGAVQ 
KQREHPPDYQWYALDQLPPDVQHKAN/DWCIiAPPPVCCQAG/PR 
TPGLK*SSCLMLPKC*NFRFILSKBSPSVEVHTNRERQQATRER 
G 


6803 


1 


2203 


KLSGRPYRHMGVLGTS KLYDIRKTI FTFTPQF IDQQQFYLAIiDN 
KMIVEMLRTDLSYLCSRWRMTGQPTITFPISHSMLDEDGTStNS 
S I LAALRKMQDGYFGGARVQTGKLSE FLTTSCCTHLS FMDPGPE 
GKLYSEDYDDNYDYLESGNWMNDYDSTSHARCGDEVARYLDHLL 
AHTAPHPKLAPTSQKGGLDRFQAAVQTTCDLMSLVTKAKELHVQ 
NVHMYLPTKLFQASRPSFNLLDS PHPRQENQVPSVRVE IHLPRD 
QSGEVDFKALVLOLKETSSIjQEQADILYMLYTMKGPDWNTELYW 
ERSATVRELLTELYGKVGEIRHWGLIRYISGIIjRKKVEALDEAC 

tdklshqkhltvglppeprektisaplpyealtqlideasegdm 
sisiltqeimvyiamymrtqpglfaewfrlrigliiqvmateia 

HSUICSAEEATEGIiMNLSPSAMKNLLHHILSGKEFGVERK/SVR 
PTDSNVSPAISIHEIGAVGATKTERTGIMQLKSEIKQVEPRRLS 
I SAESQS PGTSMTPSSGS FPSAYDQQSSKDSRQGQWQRRRRLDG 
ALNRVPVG EYQKVWKVLQKCHGLS VEGFVL P SSTTREMTPGEI K 

fsvhves\vlnvllrpeyrqllveailvltmladieihsigsii 
avekivhiandlflqeqktlgp\ddtmlakdpasg\ictlr\yd 

SAPSGRFGTMTYLS\RAA\ATYVQEFLP\HS icamq 


6804 


1 


951 


G S PGKKEE KAKNTKES LCMENS SNS S SDEDEE ETKAKMTPTKKYN '" 

GLEEKRKSLRTTGFYSGFSEVAEKRIKLLNNSDERLQNSRAKDR 

KDVWS S IQGQWPKKTLKE LFSDSDTE AAAS P PHPAPEEGVAEES 

LQTVAEEBSCSPSVELEKPPPVNVDSKPIEEKTVEVNDRKAEFP 

SSGSNFSA*IPLPYLHLNRIjHQSL*QKGSRQQSSVTVSEPLAPN 

QEEVR5IKSETDSTIEVDSVAGEIiQDLQSERE*LASRF*CX3CKL 

KQ**SARTRTS*KSLYRSEKSERCSGRRKFIKKAEKKP*SNSGK 

QQKEGKRHK 


6805


1539 


206 


RQPDLKYFGKSFDVSVSBSSSLLSNDLPKFADGIKARNRNQNYL 
VPSPVLRILDHTAFSTEKSADIVICDEECDSPESVNQQTQEESP 
I EVHTABD VP I AVEVHAISED YD I ETENNS S ES LQDQTDEE PPA 
KLCKILDK3QALNVTAQQKWPLLRANSSGLYKCELCEFNSKYFS 
DLKQHMILKHKRTDSNVCRVCKESFSTNMLLIEHAKLHEEDPYI 
CKYGDYKTVIFENLSQHIADTHFSDHLYWCEQCDVQPSSSSBLY 
LHFQEHSCDEQYLCQFCEHETNDPEDLHSHWNEHACKLrELSD 
KYNNGEHGQ Y5LLSK1T FDKCKNFFVCQVCG FRS RLHTNVNRHV 
AIEHTKIFPHVCDDCGKGFSSMLE \ IAKHLNSRLSEGI YL»CQYW 
E YS TGQIEDLKI HLD FKHS ADLPHKCSD CLMRFGNERELI S H LP 
VHETT 


6806


272 


3794 


VALCF PNSDP VMFMDAFYGCLLAELG PV P I E VPLTRKDAG5QQV "~ 
GFLIX3S CGVFIiAIiTTDACQKGLPKAQTGEVAAFKG WP PLS WLV I 
DGKHIAKPPKDWHPLAQDTGTGTAYIS YKTS KEGSTVGVTVSHA 
SLIAQCRALTQACGYSEAETLTNVLDPKRDAGLWHGVLTSVMNR 
MHWSVPYALMKANPLS WI QKVCFY KARAALVKS RDMHWS LLAQ 
RGQRDVSLSSLRMLIVADGANPWSISSCDAPLNVFQSRGLRPEV 
ICPCASSPEALTVAIRRPPDLGGPPPRKAVLSMNGLSYGVIRVD 
TEEKLS VLTVQDVGQVM PGANVCWKLEGTP YLCKTDEVGE I CV 
SSSATGTAYYGLLGITKNVFEAVPVTTGGAPI FDRPFTRTGLLG 
FIG P DHLVF I VG KL DGLMVTG VRRHNADD WATALAVE PM K FVY 
RGR I AVFSVTVliWDDR T VLVAEDP PDAS E EDS POWMSR VtDZl. T n 
SIHQVGVYCIALVPANTLPKAPIiGGIHISETKQRFIiEGTLHPCN 
VLM CPHTCVTNL P KPRQKQ PE VGPAS MI VGNLVAGKR IAQASGR 
ELAHI»EDSDQARXFIiFLADVLQWRAHTTPDH PLFLLLNAKGTVT 
STATCVQLH KRAERVAAALMEKGR1S VGDHVALVYPP GVDLIAA 
FYGCL YCGCVP VTVRP PHPQNLGTTL PTVKM I VEVS KSACVLTT 
QAVTRLLRSKEAAAAVDIRTWPTILDTDDIPKKKIASVFRPPSP 
DVLAYLDPSVSTTG I LAG VKMSHAATSALCRS I KLQCELYPSRQ 
I AI CLDPYCGLGFALWCLCS VYSGHQS VLVP PLELESNVS LWLS 
AVSQYKARVTFCCYSVMEMCTKGLGAQTGVLRMKGVNLSCVRTC 
MWAEERP\RIALTQS FS KL FKDLGL P ARAVS TT FGCRVNVAI C 


SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 
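
The legend above can be read as a simple lookup table. The following is a minimal sketch, not part of the original disclosure, of how an entry in the amino acid segment column might be decoded programmatically. The names `LEGEND`, `decode_segment` and `split_at_stops` are hypothetical, and the sketch assumes that whitespace inside an entry is only a layout artifact.

```python
# One-letter codes exactly as listed in the table legend.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine",
    "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine",
    "S": "Serine", "T": "Threonine", "V": "Valine",
    "W": "Tryptophan", "Y": "Tyrosine", "X": "Unknown",
    "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def decode_segment(segment):
    """Return (decoded, unrecognized) for one table entry.

    decoded is a list of (symbol, meaning) pairs for symbols in the legend;
    unrecognized collects every other non-whitespace character unchanged,
    which is useful for spotting transcription damage.
    """
    decoded, unrecognized = [], []
    for ch in segment:
        if ch.isspace():
            continue  # assumption: spacing and line wraps carry no meaning
        if ch in LEGEND:
            decoded.append((ch, LEGEND[ch]))
        else:
            unrecognized.append(ch)
    return decoded, unrecognized


def split_at_stops(segment):
    """Split an entry into the stretches between stop codons (*)."""
    compact = "".join(segment.split())
    return compact.split("*")


if __name__ == "__main__":
    # Invented example string, not taken from the table.
    example = "MK\\L V/A*GQ"
    pairs, leftovers = decode_segment(example)
    for symbol, meaning in pairs:
        print(symbol, meaning)
    print("unrecognized:", leftovers)
    print(split_at_stops(example))
```

Collecting unrecognized characters separately, instead of raising an error, is a deliberate choice here: it lets a reader flag entries that contain symbols outside the legend without discarding the rest of the entry.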








WTAGPDPTTVYVDMRALRHDRVRLVBRGSPHSLPtMESGKIL 
PGVKVI IAHTETKGPLGDSHLGE I WVSS PHNATG YYTV YGEEAL 
HADHFSARLS FGDTQT I HARTG YLG FLRRTELTDASGGRHDAL Y 
WGSLDETLELRGMRYHPIDIETSVIRAHRSIAECAVFTWTNLL 
VVVVELDGLEQDALDLVALVTNVVLEEHYLVVGVVVI VDPGVI P 
INSRGEKQRMHLRDGFLADQLDPIYVAYNM 


6807


1444 


606 


VGHDT VHAMFTCFPKCLGFS P PVNVTVS PR SEBSHTTT VSGGNG 
. S VFQAGPQLQALANLEARRGS 3 GAALSSRDVSGLP VYAQSGE P R 
RLTQAQVAAFPGENALEHSSDQDTWDSLRSPGPCSPLSSGGGAE 
SLPPGGPGHAEAGHLGKVCDFHLNHQQPSPTSVLPTEVAAPPLE 
KILSVDSVAVDCAYRTVPKPGPQPGPHGSLLTEGCLRSLSGDLN 
RFPCGMEVHSGQRELESVVAVGEAMA\LKFPMGAMSYCLRDRSR 
FLFRLPMGLSCPLQVQ 


6808 


2063 


737 


**** wirK»/u«ft«ji\ruftoiujoo KXvJvl tv«r Jvo wU^UKJjLnJ^lJ uHtnxj 

SRELSLYLEHQVRVGFFGSGVGLSLIIjGFSVAYAFYYLSS iakk 
PQLVTGGESFSRFLQDHCPVVTETYYPTVWCWEGRGQTLLRPF\ 
ITS KPPVQYRNEL I KTADGGQ ISLDWFDNDNSTCYMDAS TRPT I 
LLLPGLTGTSKESYILHMIHLSEELGYRCVVFNNRGVAGEaJLLT 
PRTYCC!ANTEDLETVIlUiVI^LYPSAPFIjAAGVSMGGMLI>LNYIi 

GKIGSKTPLMAAATFSVGWNTFACSESLEKPLNWLLFNYYTiTTC 
LOSSVNKHRHMVVXftvnMTYHVMTra ifiCTDQmvDGTcirMDnvfvnT 

DDYYTDAS PS PRL KSVG I PVLCLNS VDDVFS PSHAI P IETAKQN 

PNVALVLTSYGGHIGFLEGIWPRQSTYMDRVFKQFVQAMVEHGH 
ELS 


6809 


933 


45 


DYSGQTPVPTEHGMTLYTPAQTHPEQPGSEASTQPIAGTQTVPQ 
TDEAAQTDS QPLHPSDPTE KQQPKRLHVSNI PFR FRD PDLRQMF 
GQFGKI LDVE 1 1 FNERGS KGFGFVTFETSSDAJDRAREXLNGTI V 
EGR KIEVNNATARVMTNinCTRNP VTUntOTCT.rTD wri Birvr- OT7T7V jv 

VTGFPYPTTGTAVAYRGAHLRGRGRAVYOTFRMPPPPPI PTYG 
AWYQDGFYGAE I \liEATQPTDTLS PLQRRQPTATVTAES TQLP 
TRT I TPSGPRRPTALE P CETFHRFLLGP r 


6810


&9 


(& 


DYSGQTPVPTEHGMTLYTPAQTHPEQPGSEASTQPIAGTQTVPQ 
TDEAAQTDSQPLHPSDPTEKQQPKRLHVSNI PFRFRD PDLRQMF 
GQFGKI LDVE 1 1 FNERGS KGFGFVTFETSSDADRAREKLNGTIV 
EGRKI E VNNATAR VMTNKKTGNP YTNGWKTiWPVVT: Avvr: d tmtvt* 
VTGFPYPrTGTAVAYRGAHLRGRGRAViOTFRAAPPPPP IPTYG 
AWYQDGFYGAE I \ LEATQ PTDTLS PLQRRQPTATVTAES TQLP 
TRTITPSGPRRPTALEPCETFHRFLLGP 


6811


1*22 


658 


DLVTVWSFVDCRVIASTHGH\KSWVSVVAFDPYTTSVEEGDPME 
FSGSDEDFQDLLHFGRDRADSTQCRLSRRNSTDSRPVSVTYRFG 
SVGQDTQLCLWDLTODILFPHQPLSRARTHTNVMNATSPPAGSN 
GNSVTTPGNSVPPPLPRSNSLPHSAVSNAGSKSSVMDGAIASGV 
SKFATLSLHDRKERHHEKDHKRNHSMGHISSKSSDKLNLVTKTK 
TDPAKTLGTPLCPRMEDVPLLEPLI CKKI AHERLTVLI FLEDCI 
VTACQEGFICTWGRPGKWSFNP 


6812


4001 


1482 



EDAVFSLDI^TI IQGTWFLNGEELKSNEPEGQVEPGALRYRIEQ 
KGLQHRLILHAVKHQDSGALVGFSCPGVQDSAALTIQESPVHIL 
SPQDKVSLTFTTSERWLTCELSRVDFPATW YK0GQKVEES ELL 
VVKMDGRKHRLILPEAKVQDSGEFECRTEGVSAFFGVTVQDPPV 
HIVDPREHVFVHAITSECVMLACEV\DR\EDAPVRWYKDGQEVE 
ESDFVVLENEGPHRRLVLPATQPSDGGEFQCVAGDECAYFTVTI 
TDVSSWIVYPSGKVYVAAWLERWLTCELCRPWAEVRWTKDGE 
EWESPALLLQKEDTVRRLVLPAVQLEDSGEYLCEIDDESASFT 
VTVTEPP VRI IYPRDEVTLIAVTLECVVLMCELSREDAPVRWYK 
DGLE VEES EALVLERDGPRCRLVLPAAQPEDG GE FVCDAGDDSA 
FFTVTVTEPPVQFIALETTPSPLCVAPGEPVVLSCELSRAGAPV 
VWSHNGRPVQEGEGLELHAEGPRRVLCIQAAGPAHAGLYTCQSG 
AAPGAPS LS FTVQVAE P PVR WAP EAAQTRVRS TPGGDLE LWH 
LSGPGGP VR WYKDGERLAS QGR VQLEQ AGAR Q VLR VQGARS GDA 
\3B» i Jb UU AVy IJia X 1 c Lt Vb V E fc» fc* Jj Jb V /Lb VaDLTPLTVHEGDDATFR 

CEVS P PDADVTW LRNGA WT PG PQRQS CCSYGGCRMCGQRKART 
CVSKWRQAEWVQRGPCAGCBVGSPCPTTLACPWPRMOTSTASSS 
MVS Y W PTRAPTAARATT IAPWFGS A 


6813 


9 


836 


SSTQQRPGVPAGPRPIiDGYIiGVADHKPLKMHCRDCALVTSSGHI* 
LHSRQGSQIDQTECVIRMNDAPTRGYGRDVGNRTSLRVIAHSSI 
QRILRNRHDLLNVSQGTVFIFWGPSSYMRRDGKGQVYWNLHIiLS 
QVLPRLKAFMITRHKMLQ FDELFKQETGQ\NRKISNTWLS TGWF 
TMTIALELCDRINVYGMGPPDFCRDPNHPSVPYHYYBPFGPDEC 
TM YLSHERGR KGSHHRF ITEKRVFKNwARTFNI HFFQPDWKPBS 
LAINHPENKPVF 


6814 


3 


737 


KFRRQEAN/ AREKNRMHGLNDALDNXiRKVVPCYS KTQKLS KIET 
LRLAKNYIWALSEILRIGKRPDLLTFVQNLCKGLSQPTTNLVAG 
CLQLNARS FLMGQGGEAAHHTRS P YSTFYPP YHS PELTTP PGHG 
TLDNS KSMKPYNYCSAYES FYESTS PECASPQFEGPLSPP PINY 
NG I FS LKQEETLD YG KNYN YG MH YCAVPPRGPLGQGAMFRIiP TD 

CTJI7DVF1T f-TT DQftOT T»MrtlMI»T XTfetfUTJXT 


6815 


306 


553 


QGLDPASQTKWEIjLKDGSGRRGDRRSSRDMAGGAGPRSE SDLE 
DVGPTAEWNGDG SGSLRRSGS FGKLRDALRRS SEMLVKKLQGGT 
PQEPPNPRMKRASSLNFLNKSVEEPTQPGG 


6816 


L 


803 


NLLKTHKF \LLGQDEDSLHS VP VAOMGNYQE YLKTIiAS PLRE 1 D 
PDQPKRLHTFGNPFKQDKKGMMIDEADEFVAG PQNKVKRPGEPN 
SPMSSKRRJISMSLLLRKPQTPPWnraVGGKGPPSASWFPSYPN 
LI KPTL VHTDAT 1 1 HDGHEEKMENGQ I TPDGFLS KS APS ELI NM 
TGDLMPPNQVDSLSDDFTSLSKDGLIQKPGSNAFVGGAKNCSLS 
VDDQKDPVASTLGAMPNTLQITPAMAQGINADIKHQLMKEVRKF 
GRSK 


6817 


172


34**7 


MaMDSPKIGfeK^VIGPGTDIGISSI^ 
DEYCPACKEKGKLKALKTYRI S FQES I FLCEDLQCI YPLGS KSL 
NNLISPDLEECHTPHKPQKRKSLESSYKDSLLLANSKKTRNYIA 
I DGGKVLNS KHNGEVYDETS S NL PDSSGQQNP I RTADSLERNE I 
LEAOTVD^TTKDPATVDVSGTGRPSPQNEGCTSKLEMPLESKC 
TS FPQALCVQWKNAYALCVTLDCI LSALVHS EEL KNTVTGLCS KE 
ES IFWRLLTKYNQANTLLYTSQLSGVKDGDCKKLTSKIFAEIET 
CLNEVRDEIFISLQPQLRCTLGDMES PVFAFPLLLKLETHIEKL 
FLYS FS WDFE CSQ CGHQYQNRHMKS LVT FTNV I PEWHPLNAAHF 
GPCIWCNSKSQIRKMVLEKVSPIFMLHFVEGLPQNDLQHYAFHF 
EGCLYQITSVIQYRANNHFITWILDADGSWLECDDLKGPC9ERH 
KKFEVPAS E IH I VI WERKI SQVTDKEAACLPLKKTNDQHALSNE 
KP VS LTS CS VG DAAS AETAS VTH PKD I SVAP R T L S QDTAVTHGD 
HLLSGPKGLVDNILPLTLEETIQKTASVSQLNSEAFL\LENKPV 
At»N iiKXn 1 IjIjSUISSIiMASo vsafcnekliqdqfvdisfpsq 
VVNTNMQS VQIiNTEDTVNTKS VNNTDATGLIQGVKS VE IE KDAQ 
LKQFLTP KTEQ LKPERVTS Q VSNLKKKETTADS QTTTSKS LQNQ 
SLKENQKKPFVGSWVKGLISRGASFMPLCVSAHNRNTITDLQPS 
VKG VNNFGG FKTKG I NQKAS HVS KKARKS AS KP P PI S KPPAGPP 
SSNGTAAHPHAHAASEVLEKSGSTSCGAQLNHSSYGNGISSANH 
EDLVEGQIHKLRLKLRKKLKAEKKKLAALMSSPQSRTVRSEKLE 
QVPQDGS PNDCES IEDLLNELPYP IDIANESACTTVPGVSLYSS 
QTHE B I LAE LLS PTP VSTELS ENGEGD FRYLGMG DSHI PPPVPS 
EFNDVS QNTHLRQDHNYCS PTKKNPCEVQPDSLTNNACVRTLNL 
ES PMKTDI FDEFFS S S ALNALANDTLD LPHFDE YLFENY 


6818 


2 


240


RGFDKVLWT/LSGAVK\CVQFSRISPDGEEGYPGELKVWVTYTL 
DGGE/ LHS /ATTEHKP/ VQATPVNLT\TI LTSTWQARLPQ1 


6819 


1 


961 


GIPCTB^NFT>NANVTGEIEFAIHYCPKTHSLBICIKACKNLAY 
GEEKKKKCNPYVKTYLLPDRSSQGKRKTGVQRNTVDPTFQETLK 
YQVAPAQLVTRQLQVSVWHLGTLARRVPLGEVI 1PLATWDFEDS 
TTQSFRWHPLRAKADKYEDSVPQSNGELTVRAKLVLPSRPRKLQ 
EAQEGTIXJPSLHGQLCLVVLGAKNLPVRPDGTLNSFVKGCIiTLP 
DQQKLRLKS PVLR KQ AC PQWKHS FVFSGVT PAQLRQS S LELTVW 
DQALFGMNDRLLGGT\RLGSKGDTAVGGnACSQSKLQWQKVLSS 
PNLWTDMTLVLH • 


6820 


1014 


340 


GDMVYXVGHVPPGFFEKTQNKAWFREibFjNBKYLkVVRKHHRVlA 
GQFFGHHHTDS FRMLYDDAG VP I S AMFI TPG VTPWKTTLPG VVN 
GANNPAIRVFEYDRATLSLKDMVTYFMNI^QANAQGTPRWELEY 
QLTEAYGVPDASAHSMHTVLDRIAGDQ9TLQRYYVYNSVSYSAG 
VCDEACSMQHVCAMRQVD I DAYTTCL YASGTTP VPQLP LLIjMAL 
LGLCT 


6821


1088 


518 


EFDIYR/BVGGEFVPVTRDDSSNGFPRTQHGPSPTVriPiQSPQN 
RFCVLTLDPETLPAIATTLIDVLFYSHSTPKEAASSSPEPSSIT 
. FFAFSLIEGYI\SIVMDAETQKKFPSDLLLTSSSGELWRMVRIG 
GQPLGFDBCGIVAQ IAGPLAAAD I S AYYISTFNFOHAIiVPEDGT 
GS V I E VLQRRQEGLA8 


6822


1088 


518 


EFDI YR/EVGGEFVPVTRDDSSNGFPRTQHGPSPTVHPIQS PQN 
RFCVLTLDPETLPAIATTLIDVLFYSHSTPKEAASSSPEPSSIT 
FFAFS 1*1 EG Y I \S I VMDAETQ KKF P S DLLLTSS SGELWRM VRIG 
GQPLGFDECGIVAQIAGPLAAADISAYYISTFNFDHALVPEDG1 
GS VI EVLQRRQEGLAS 


6823


654 


221 


PPKIiLSRWARMGHGDBIV\LSDLNFPGLLHLPWGPWRSVQTAG 
GIPQLLEAVLKLLPLDTYVESPAAVMELVPSDKERGLQTPVWTE 
YESILRRAGCVRAIiAKIERJEPYERAKKAFAWATGETALYGNL 
ILRKGVLALNPLL 


6824


858 


104 


LLLAQR WGWG \ CCFFS LAVS VKMNVLL FAPGiiLFl*LLTQFGFRG 
ALPKLGICAGLQVVI^LPFLLENPSGYLSRSFDLGRQFliFHWTV 
NWRFLPEALFIiHRAFHLALLT7\HLTLLIjLFALCRWHRTGES I LS 
LLRPPSKRKVPPQPLTPNQIVSTLFTSNFIGICFSRSLHYQFYV 
WYFHTLP YLLWAMPARWLTHLLRLLVLGIjIELS WNTYPSTS CSS 
AALHI CHAVILLQIjWLGPQPFPKSTQHSKKAH 


6825 


3 


1173 


SSGEFGLQASDIMWTISDTGWILIILCSIiMEPWALGACTFVHLL 
PKFD PLVI LKTLS S YP I KSMMGAPI VYRMLLQQDLS S Y KF PHLQ 
NCLAGGES LLPETLE NWRAQTGLD I RE FYGQTETGLTCMVS KTM 
KIKPGYMGTAASCYDVQI I DDKGNVL PPGTEGDIG IRVKP IRPI 
GIFSGYVDNPDKTAANIRGDFWLLGDRGIKDBDGYFQFMGRADD 
IINSSGYRIGPSEVENALMEHPAWETAVISSPDPVRGEWKAF 
VILAliQFLSHDPEQLTKELQQHVKSVTAPYKYPRKIEFVLNLPK 
TVTGKIQRA\KLRDKEWKMSGKAPCAVRHLRDIHLDSPLLSLSF 
PFGPIiAIiPMIXSYGDSLWEEHEYKFCIiALVISTKLYHVRC 


6826 


2304 


954 


LKTES F KP W/ VNI AIiAFHLLGERASPNS FWQP YIQTLPRE YDTP 
LYFEEDEVRYLQSTQAIHDVFSQYKNTARQYAYFYKVIQTHPHA 
NKLPLKDS FT YED YRWAVSS VMTRQNQ 1 PTEDGSRVTLAL I P LW 
DMCNHTNG L I TTG YNLEDDR CE CVALQDFRAGEQ I YI FYGTRSN 
AEFVTHSGFFFDNNSHDRVKIKLGVSKSDRLYAMKAEVLARAGI 
PTSSVFALHFTEPP I SAQLIiAFLRVFCMTEEELKEHLLGDSAID 
RI FTLGNSEFP VS W DNEVKLWTFLEDRAS LLLKTYKTTIE SDKS 
VLKNHDLSVRAKMAIKLRLGEra^ 

EKAPLPKYEESNLGLLESSVGDSRLPLVLRNLEEEAGVQDAIiNI 
REAI S KAKATENG L VNGENS I PNGTRS ENES LNQES KRAVEDAK 
GSSSDSTAGVKE 


6827


1 


779 


SSVVEPGL3VLGGLFLLFVLENMLGLI*RHRGLRPRCCRRKRRNL 
ETRNLDPENGSGMAliQPLQAAPEPGAQGQREKNSQHPPAtiAPPG 
HQGHSHGHQGGTDITWMVLLGDGIiHNLTDGLAIGAAFSDGFSSG 
L5TTLAVFCHBLPHE LG DFAMLLQSGLS FRRLLLLS L VS GALG L 
GGAVLGVGLS LG P VP LT PWVFGVTAGVPL YVALVDML P AL PPS S 
GAPAYA\HVLLQGLGLLLGGCLMLAITLLEElUiLFVTTEG 


6828 


3 


1654 


KSQHG/WILQLMHSCKEGYVKDLKGNPGIiHRAMIiDLDNGTRPSE 
LGHLSQTASLXRGS S FQSGRDDTWRYKTPHRVAFVEKLTKLVIiS 
QL PNFWKLW Z S YVNGSLFSE TAEKSGQI ERS KNVRQRQND PKKM 
I QEVMHS LVKLTRGALLPLS I RDGEAKQYGGWEVKCELSGQWLA 
RAIQTVRLTHES LTALE I PNDLLQTI QDLILDLRVRCVMA^TjQH 
TAEEIKRLAEKEDW1VDNEGLTSLPCQFBQCIVCSLQSLKGVLE 
CKPGEAS VPQQPKTQEEVOQLS INTIMQVFI YCLEQLSTKPDADI 
DTTHLSVDVSSPDIiFGS IHEDFSLTSEQRLLIVLSNCCYLBRHT 
FLNIAEBFEKHNFQGIEKITQVSMASLKBLDQRLFENYIELKAD 
P I VGSLE PG I YAG Y FD W KD CLP P TG VRNY L KEAL VN I IAVHAEV 
FTISKELVPRVtiSKVIEAVSEELSRLMQCVSSFSKNGAIjQARl>K 
ICALRDTVAVYTjTPBSKSSFKQALEALPQLSSGADKKLLEELLN 
i KFKSSMHLQLTCFQAASSTMMKT 


6829


1 


782 


MRMEAGEAAPPAGAGGRAAGGWGKWVRLNVGGTVFLTTRQTliCR 
BQKSFLSRLCQGEELQSDRDETGAYLIDRDPTYFGPILITFDRHG 
KLVLDKDMAEEGVLEEAE FYNI GPLIRII KDRMEEKDYTVTQVP 
PKHVYRVLQCQEF^LTQMVSTMSDGWRFEQLVNIGSSYNYGSED 
QAEFLCWSKELHSTPNGIjSSESSRKTKSTEEQI^EQQQQBEEV 
EE VEVEQVQ VEADAQE K / CCYKPEAPGCEAPDHLQGIiG VP I 


6830 


1 


939 


MBPGSVENLS IVYRSRDFLWNKHWDVR IDS KAWRETLtfLQKQL 
RYR F PELADPDTCYGFRFCMQLDFSTSGALCVAIiNKAAAGS AYR 
CFKERRVTKAYLALLRGHIQESRVTISHAIGRNSTEGRAHTMCI 
EG SQGCENP KPSLTDL WLEHGLYAGDP VSKVLLKPLTGRTHQL 
RV\HCSAIX3HPVVGDLTYGEVSGREDRPFRMMLHAFYLRI PTDT 
EC VE VCTPDPFLPSIjDACWS phtllqs ldqlvqalrat PDPDPE 

DRGPRPGSPSAliLPGPGRPPPPPTKPPETEAQRGPCLQWLSEWT 
LEPDS 


6831 


3 


1087 


SLFFGSSTPDNKVAEQEDIiETQPSPSVEKAVTVIDPEGTIPTNF 
NVAEKPADHSLSEVKLKTADBPRGTLVKSGDGQNVKEK5MI LSN 
VEDLQQPKFISEVSREDYGKKEISGDSEEMNINSWTSADGENL 
EIQSY9LIGEKLVMEEAKTIVPPHVTDSKRVQKPAIAPPSKWNI 
SIFKEEPRSDQKQKSLLSFDWDKVPQQPKSASSNFASKWITKE 
SEKPESIILPVEESKGSLIDFSEDRLKKEMQNPTSLKISEEETK 
LRSVS PTEKKDNIiENR\ SYTL\AEKKVLAEKQNSV\APLELRDS 
NEIGKTQITLGSRSTELKESKADAMPQHFYQNEDYNERPKI IVG 
SEKEKDEKKKK 




1809 


412 


MGSGLISGPPQDNSGEALKEPERAQEHSLPNFAGGQHFFEYLLV 
VSLKKKRSEDD YE P 1 1 TYQF PKRENLXiRGQQEEEERLLKA I PliF 
CFPDGNEWASLTEYPRETFSFVLTNVDGSRKIGYCRRZjLPAGPG 

PRIiPKWf , TT.Qf , T^r , W/2T.WGTTTT PlTVCVDuPiT PIluti trr>ot«»-\r-iT 
rrvur- ivviciio LluLrubro 1\,± LiUa V £■ ilKnQ i § MAVX 3t P FMQGLj 

REAAFPAPGKTVTLKSFI PDSGTEFISLTRPLDSHLEHVDFSSL 
LHCLSFEQILQI FASAVLERXII FLAEGLSTLSQCIHAAAALLY 
PFSMAHTYIPWPESLIATVCCPTPFMVGVQMRFQQEVMDSPME 
E VLLVNLCEGTFLMS VGDE KD I LP P KLQDDILDS LGQGIN ELKT 
AEQINEHVSGPFVQFFVKXVGHYASYIKREANGQGHFQERSFCK 
ALTSKTNRRFVKKFVKTQLFSLFIQEAEKSKNPPAGYFQQKIliE 
YEEQ KKQ/ TETKGKNCE IRAWNKND 


6833 


1 


1129 


PLMTLSQCGG I PGHG HSHG GHGHGHGL P KGPR VKSTRPGS 5» D I Nf 
VAPGEQGPDQEETNTLVAOTSNSNGLKlxDPADPENPRSGDTVEV 
QVNGNLVPJSPDH^LEEDPJVGQLNMRGVFLH 
NALVFYFSWKGCSBGDFCVNPCFPDPCKAFVEIINSTHASVYEA 
GP CWVLYLDPTLCVVMVC I LL YTTYPLLKESAL ILLQTVPKQI D 
I RNLIKELHNVEGVEEVHELHVWQLAGSRI IATAHIKCEDPTS Y 
MEVAKTI KDVFHNHG IHATT IQPEPASVGS KS SW PCELACRTQ 

AEN1PA\WIEIKN\IPNK\QPESSL 


6834 


78 


1151 


AGQERPAPIWRIiLWLPTPSVSRKAEPAHIPINR*GA*E*RGGLP 
LCGSSASAYGWH* RLTPWSPGGS * HM* SSKAPVTQARE VLVAGP 
CSKLVLSGARGIVGTTVQVLVEAQQPLLLLFTGVWGLNLRAGEE 
SI^*LIBEVTQVHDAHLGNAWGCAQCLSQGQVGSAIJVKALLE 
AAAAVRDCKEVLTVSGDKQQAEVS VRL * VRDVCVEEAGCVE PGQ 
•tt-mjtt ±Aj jj/Uj/iAOKoU THIS vSeQ VQ VDGVQKL VLSAHE CHELVAG 
QQDGEDQAARTRLLQAGAHSVAHGRRQGQAPCRPHQEAGVS CHE 
LQQWGDAXi * ARE * APQ 1 I VLLLLEDVAQLRTGKKA* DLVVDVE 
QLLRQL 


6835


1 


834 


GIPAADR\EASLELIK^DISRTFPNLCIPQQGGPYHJDMLliSlLG 
AYTCYRPDVGYVQGMS FIAAVLI LNLDTADAF I AFSNLLNKP CQ 
MAFFRVDHGLMLTY FAAFE VPFE ENLPKLPAHFKKNNLT PDI YL 
I DW 1 FTL YSKSLPl»DIiACRI WD VFCRDGEEFLFRTALGI I/XLFE 
DILTKMDFIHMAQFLTRIiPEDLPAEELFASIATIQMQSRNKKWA 
QVLTALQKDS REMRBGKS VPPTLRLQRE FALGTNQS PMPRPLCC 
FRLTPGQPRRTDAIi 


6836 


1 


850 


MSCGRPPPDVDGMITLKV\DNLTVRTSPDSLRRVFfiKVGRVGDV~ 

YIPREP.HTKAPRGFAFVRFHDRRDAQDAEAAMDGAELDGRELRV 

QVARYGRRDLPRSRQGRRHAAGPEAA/RYGRRSRSYGRRSRSPR 

RRHRSRSRGPSCSRSRSRSRYRGSRYSRSPYSRSPYSRSRYSRS 

P YSRSRYRESR YGG S H YSSSG YSNS R YSRYHSSRSHS KSGS S TS 

SRSASTSKSSSARRSKSSSVSRSRSRSRSSSMTRSPPRVSKRKS 

KSRSRSKRPPKSPEEEGQMSS 


6837 


1 


1369 


i uuHAViUiNFGS D YFPGGTAP /GGPRTRRP \ SGTSS SGS KASGP 
PNP PAQGDGTSLS PNYTLESTSGNDGKPVSGGGGRGRGRRKRDS 
GHVS PGTFFDKYSAAPDSGGAPGVS pgqqoasgaavggs saget 
RGAPTPHEKALTS PSWGKGAELLLGDQPDLIGS LDGGAKSDSSS 
r« vuorAoufiivo ioXAWoUisVSSSSDNPQALVKASRSPLVTGSP 
KLPPRGVGAGEHGPKAPPPALGLGIMSNSTSTPDSYGGGGGPGH 
PGTPGLEQVRTPTSSSGAPPPDEIHPIjEILQAQIQLQRQQFSIS 

edqplglkggkkgecavgasgagngdselgsccs eavksamsti 
dlds lmaehs aawympadkalvdsadddktlap wbkakpqnpns 
keahdlp ankas as q pg s hlqcls vhctddvgdakaras vp twr 
slrs di snrfgtpvaalt 


6838 


16 


499 


bTDTPPPKTHMIHHSISDYKATLRCWALGFYPMEITLTWQQDEE " 
DQTRDMEIiVETRPAGDGTFQKWAAVWPSGEE/Q/RYMCHVQHE 
GLPE PLTLRWEQSS QPT 1 P I VGI VAGLVLLGAVVTGAVVSAVMC 
RKKNS DRVSYS EAAS s dhaot; t >rh nrv 


6839 


1 


1195 


AAPAGGGPDPEAI^AFPGRHLSGLSWPQVKRLDALLSEPIPIHG ' 
RGNFPTLS VQPRQIRAGG PQHPGGAG \ IHVHRVRLHGSAASHVL 
HPESGLG YKDLDLVFRMDLRSEAS FQiTKAVVLACLLDFL PAGV 
SRAKITPLTLKEAYVQKLVKVCTDSDRWSLISLSNKSGKNVELK 
FVDSVRRQFEFS IDS FQI ILDSLLLFGQCSSTPMSEAFHPTVTG 
ESLYGDFTEALEHLRHRVI ATRS PEE I RGGGLLKYCHLLVRGFR 
PRPSTDVRALQRYMCSRFFIDFPDLVEQRRTLERYLEAHFGGAD 
AARR YACLVTLHRVVNE5TVCLMNHERRQTLDL I AALALQALAE 
QGPAATAALAWRP PGTDGVVP ATVNY YVTP VQPLLAHAYPTWLP 

as 


6840 


4254 


2061


ELQGDFS VPDVPKSMAWCENS ICVGFKRDYYLIRVDGXGS IKEL 
FPTGKQLEPLVAPLADGKVAVGQDDLTVVLNEEGICTQKCAIjNW 
TDIPVAMEHQPPYIIAVLPRYVBIRTFEPRtLVQSIELORPRFI 
TSGGSNI I YVASNHFVWRLIPVPMATQIQQLIiQDKQPELALQLA 
EMKDDSDSEKQQQIHHIKNLYAPNLPCQKRFDESMQVFAKbGTD 
PTHVWSLYPDLIiPTDYRKQLQYPNPLPVLSGAELEKAHLALIDY 
+j+ vf'vnowjj v ruxuss LtourHjot) l o^JjMEGTPTIKSKKKLLQIIDTT 
LLKCYLHTNVAIiVAPLLRLENNHCHIEESEHVLKKAHKYSELI I 
LYEKKGLHEKAIX2VLVDQSKKANS PLKGHERTVQYLQHLGTENL 
HLI FS YSVWVLRDFPEDGLKI FTEDLPEVES LPRDRVLGFLIEN 
FKGLAI PYLEHIIHVWEETGSRFHNCLIQLYCEKVQGLMKEYIjL 
SFPAGKTPVPAGEEEGELGEYRQKLLMFLEISSYYDPGRLICDF 

pfdgli^erallixsrmgkheqalfiyvhilkdtrmaeeychkhy 

DRNKDGNKDVYLSLLRMYLSPPS I HCLGPI KLELLEPKANLQAA 
LQVLELHHSKLDTTKALNLLPANTQIND IRI FEjEKVLEENAQKK 
RFNQ VLKNLLHAEFLRV\QEER I LHQQVKC I ITEEKVCMVCKKK 
IGNSAFARYPNGVWHYFCS\KEVNPADT 


6841 


1 


3206 



TPSTTGTKSNTPTSSVPSAAWPLNESLQPtGDYGOGSKNSKRA 
REKRDSRN^^?VQVTQEMRNVSIGMGSSDEWSDVQDIIDSTPELD 

mcpetrldrtgssptqgivnkafgintdslyhelstagsevigd 
vdegadlix3e fsgmgkevgnlll ensqllbt knalnvvkndli a 
kvdqi^geqevlrgeleaakqakvklenrikeleeelkrvksea 
i iarrepkeeaedvssylctesdki pmaqrrrftrvbmarvlme 
rnqykerlmelqeavrwtemirasrehpsvqekkkstiwqffsr 
lfss sss pppakrpypsgnihyks pttagfsqrrnhamcp i sag 
srpleffpdddctssarreqkreqyrqvrehvrnddgrlqacgm 

SLPAKYKQLSPNGC5QEDTRMKNVPVPVYCRPLVEKDPTMKLWCA 
AQVNLSGWRPNEDDAGNG VKPAPGRD PLTCDREGDGEPKSAHTS 
PE KKKAKELPEMDATS S RVW I LTS TLTTS KW 1 1 DANQPG TWD 
QFTVCNAHVLCI S3 1 PAASDSDYPPGEMFLDSDVNPEDPGADGV 
LAG I TLVGCATRCNVPRSNCSSRGDTPVIjDKGQGEVATI ANGKV 
wi-5jya X KKATEATE VPDPGPSEPETATLRPGPLTEHVFTDPAPT 
PSSGPQpGSENGPEPDSSSTRPEPEPSGDPTQAGSSAAPTMWIiG 
AQNGWL YVHSAVANWKKC1.HS I KLKDS VLSL VHVKGRVLVALAD 
GTCAI FHRGEDGQWDLSNYHLMDLGHPHHSIRCMAWYDRVWCG 
YKNKVHVTQPKTMQIEKS FDAHPRRESQVRQLAWI GDGVWVS IR 
LDSTLRLYHAHTHQHLQDVDI EPYVSKMLGTGKLGFS FVR ITAL 
LVAGSRLVWGTGNGWISIPLTETVVLHRGQ\LLG\1«RANKTSP 
\x invjy \UUaautU\PiRSr IPYCSMAQAQLCFH 
GHRDAVKFFVSVPGNVLATIiNGSVLDSPAEGPQPAAPASEVEGQ 

KLRNVLVLSGGEGYIDFRIGDGEDDETEEGAGDMSQVKPVI.SKA 
ERSH I I VWQVS YTPE 


6842 


3 


926


K{JQQIjS ATIIiTDHQYLERTPLCAXLKQKAPQQYR I RAKLRS YKP 
RRLFQS VKLHCPKCHLLQEVPHEGDLDI IFQDGATKTPDVKLQN 
TS LYDS KI WTTKNQKGRJKVAVHFVKNNG ILPLSNE CLLL I EGGT 
LS E I CKLSNKFNS VI PVRSGHKDtiET.T .m J5 a PWT. t or»'T»T7iiiJw< 

^ 1 *** * r v *v>Jvjiigily Ufa t.tljLIL>u3r\lr F L)X IJVj X. Villi YGC 

KQWST* RSIQNLNSLVDKTSWIPSSVAEALGI VPLQYVFVMTFT 
LDDGTGVLEAYLMDSDKFFQ I PAS EVLMDDDLQKS VDMIMDMFC 
PPGIKI DAYPWLECFI KS YNVTNGTDNQ I CYQ I FDTTVAEDVI 


6843 


2 

244 ■■+ 


851 


NHRKVLSGAKKYECNECGKSFAYTSSLI KHRRIHTGERPYECSE" 
OSRSFAENSSLIKmiRVHTGERPYECVECGKSFRRSSSLLQHQR 
VHTRERP YBCSBCGKS FS LRSNL IHHQRVHTGERHE CGQCGKS F 
SRKSSLIIHJjRVHTGERPYECSDOGKSFAENSSLIKHLRVHTGE 
RPYECIDCGKSFRHSSSFRRHQRVHTGMRPYK*SKFWKFSCPGF 
LLLC3GQRVHTGSRCYECDKWGI FFS*NASFFT* KSAPTEEVPFE 
CNECEKAFS PIiSLVTTIFT 


6844 




642 


EHQLAGFELRKTQTSMSLGTTREKTDRVKSTAYLSPQELEDVFY 
QYDVKSE I YS FGI VL WE I ATGD I P FQGCNS BK I R KL VAVKRQQE 






WO 01/53312 



PCT/US00/34263



P LGEDCPS BLRE 1 1 DECRAHDP5 VRPSVDE I LKKbSTFS K * CI K 
I 


6845


3 



1519 


VAVRDE CY WRH VF WDQDLWMLLF I LM CH PETARARLE YR H RTLD 

wvw s*v**\\4L% iaj i WWIAT *i n aoHU to IxIj E» v ^« F J5 U JL i G V QB VH VNGA V 

GLAFELYYHTTQDLQL FREAGG WD WRAVAE FWCSRVBWS PR BE 
KYHIiRGVMSPDBYHSGVNNSVYTNVLVQNSLRFAAAIiAQDLGLP 
IPSQWLAVADKIKVPFDVBQNPHPEFDGYEPGBWKQADWLLG 
YPVPFSLSPDVRRKNLEIYEAVTSPQGPAMTWSMFAVGWMELKD 
AVRARGLLDRS FANP4AE PFKVWTENADGSGAVNFLTGMGG FLQA 
WFGCTGFRVTRAGVTFDPVCLSGISRVS VSGI FYQGNKLNFS F 
SEDSVTVBVTARAGPWAPHLEAELWPSQSRLSLLPGHKVSFPRS 
MijKAynoi^i'j^PUdbSShiFPaRTF SDVRDPLQS PLWVTliGSSSP 
TES LTVDPASE * SGTGASETSLGPSLWPRLHPPLLGTLLACHPS 
PAARLSGKVHAAWPEFKAFCL 


6846 


213 


1258 


LYFLKTIK* LNRLAEHP * YENEXLTKLRNTIMBQYTRTEESARG 
1 1 F TKTRQS AYALSQW I TENEK F AE VGVKAHHL I G AGHS S E FKP 
MTQNEQKEVISKFRTGKI NLLIATTVAEEGLD IKECNIVIRYGI* 
VTNEIAMVQARGRARADESTYVLVAHSGSGVIEHETVNDFREKM 
MYKAI HCVQNMKPEE YAHKI LELQMQS IMEKKMKTKRNI AKHYK 
NNPSLlTFLCWffCSVLACSGEDIHVIEKMHHVNMTPEFKEliYIV 

PTTUTf^T -f^ tftf ^ TL ^T_T]*\ T TlTj— t u T T J~nr j~kJ~t j-lIl r.i/uiiL > ■_ ■ i — ^ 

Ki^ivl^KKUADYPINGEIICKCQQAWGTMM^ 
NTVVVFKNNSTKKQYKKWVELPITFPNLDYSECCLFSDED 


6847 


1450 


348 


SMCWNSDRLEMPLlDLALILYPPSYVPYTGHLSDDSLSRKYCLTr^ 
WFEDALNGVL*RAEAIQPHCVNAGDRMEKFRQKYVJNKLOTIiRQQ 
PFAYGTLTVR^LLDTREHCLNEFNFPDPYSKVKQRENGVAliRCF 
PGWRS LDAKSWEERQIiALVKGLIiAGNVFDWGAKAVSAVLES DP 

IDI ILGVFPFVRELLLRGTEV1LACNSGPALNDVTHSESLIVAE 
RI AGMDP WHSALRJSERLLLVQTGS S S PCLDLS RLDKGLAALVR 
ERGADLWIEGMGRAVHTNYHAALRCESLKLAVIKNAWIiAERLG 
GRLFS VI FKYBVPAE 


6848 


19 


16 


AMWWNSl^IRNIVr^NPKKRNTI^IAMLKSLQSDILHDADSND~~ 

IRNHPVPVIAMVNGLATAAGCQLVASCDIAVASDKSSFATPGVN 
VGLFCSTPGVALARAVPRXVALEMLFTGEPISAQEALLHGLLNK 
WPEASLQEETMRIARKIASLSRPWSLGKATFYKQLPQDLGTA 
YYLTSQAI^^^ J ALRDGQEGITAFLQKRKPVWSHEPV*VEH 


6849 


70 


821 


SLGVDGSCLEQGSPAPRPQTDTSP*PVGiWAT<iQEDLYHQdYfeSc'~ 
VCVLFASVPDFKEFYSESNINHEGLECLRLLNEIIADFDEIiLSK 
PKFSGVEKIKTIGSTYMAATGLNATSGQDAQQDAERSCSHIiGTM 
VE FAVALGSKLDVTNKHS FNNFRLRVGLNHGP WAGVIGAQKPQ 

YDIWGNTVNVA5RMESTGV1/3KIGVTEETAWALQSIjGYTCYSRG 
vt kvkgkg0lctyflntdiitrtc5p psatit! 


6850 


2 


1235 


ARGLNHEWTFEKLRQHISRNAQDKQEIjHIiFMLSGVPDAVFDLTD "* 

LDVL KLELIPRAKI pakisqmtnlqelhlchcpaicveqtafs fl 

RDHLRCLHVKFTDVAEIPAWVYLLKNLRELYLIGNLNSENNKMI 
GI^SLRBLRHI^IUIVKSNLTKVPSNITDVAPHLTKLVIHNDGT 
KLLVIiNSLKKMMNVAELELQNCELERI PHAI FSLSNLQELDLKS 
NNIRTI EEI IS FQHLKRLTCLKLWHNK I VTI PPS ITHVKNLESL 
YFSNNKLESLP VAVFSLQKLRCLDVS YNN I SMI P IEIGLLQNLQ 
HLHITGNKVDI LP KQLFKC IKLRTLNLGQNCITSLPEKVGQLSQ 
LTQLELKGNCLDRLPAQLGQCRMLKKSGLWEDHLFDTLPLEVK 
EALNQDINIPFANGI 


6851


1765 


660 


VS AQ VS AREGENCLGWNLADS S QES YKSLEEAED CYPPSLLTLD ~ 
LRDLFNQVEQGPLLS CPKAGTDLSMGRAREVGWMAAGUMIGAGA 
CYCVYKLTIGRDDSEKLEEEGEEEWDDDQELDEEEPDIWFDFET 
MARPWTEDGDWTEPGAPGGTEDRPS GGG KANRAHPI KQRP FPYE " 
HKNTWS AQNCKNGS CVLDLS KCL FI QGKLL FAE PKDAG FP FSQD 
INS HLA S LSMARNTS PTPDPTVRKALCAPDNLNAS IESQ6Q IKM 
YINEVCRETVSRCCNS PLOOAnT,wr,T.T<3M'nrTMMMT a vc a orvr v 

FPLISEGSGCAKVQVLKPLMGLSEKPVLAGELVGAQMLFSFMSL 
F I RNGNRB I LLETPAP 


6852 


1 


407 


RTRG E E TYAN F I KHN DG KNI FYAART P ATLFAVM F AMY I I S GLT ' 
GFIGLNSIAVLCNLVMGLALIFLCTWAYVKYSGEFRE1GTVIDQ 

TAETTtMP'nVT.TrOT/^nMT.MTPIPMTXJnc^fPKTeTWiM-tT mnm>r>f tit« ri-r 
- L - ri - Cj * **<n v u jve* iaj iMH XiviCi aw j. KSJo v 1 Mo I KAGIjTDQVS HHARL 

KTD 


6853


3 


469 


QDSWCIELYKPNDLVRtLTO^tFi^^rCTOPltfUEHRTCPMC 
KCDILKALOIBVDVEDGSVSLQVPVSNEIFNSASSHEEDNRSET 
AS SG YASVQG TYEP PLBEHVQS TNE SLQLVNHBANS VAVD VI PH 
VDNPTFEBDETPNQETAVRE I KS 


6854 


1148 


585 


«Jsi X Aw lrDF<jJ:JXVCAAlQWLQDN5A5x 

PVKNTFfcRMWI YSHHI YQQDLRKKI LDVG KRLD VTGFCMTGKPG 
I ICVEGFKEHCEEFWHTIRYPNWKHISCKHAESVETEGNGEDLR 
LFHS FEEIJiLEAHGDYGLRiroYH^LGQFLEFLKKHKSEHVFQ I 
LFGIESKSSDS 


6855 


1913 


1148 


GRVGGRVGRICSPLSGANEYIASTDTLKTHiiVLLiFTDQTDDIjAK 
EEPTStiFQRDSETKGESGLVLEGDKE IHQIFEDLDKKLAIASRF 
Y I PEG CIQRWAAEMWAliDALHREG I VCRDLNPNNILLNDRGHI 
QLTYFSRWSEVEDSCDSDAIERMYCAPEVGAI TEETEACDWWSL 
GAVLFELLTGKTLVECHPAG INTHTTLNMPEWVS E EARS L I QQL 
ubniAMiu w /Us vai/i IwHfrc Tir V DWAEJjMR 


6856


1617 


997


VTQLYVSVDASTKDSLKKIDRPLFKDFWQQFLDSLKALAVKQQR * 
TVYRLTLVKAWNVDEIiQAYAQLVSLGNPDFIEVKGVTYCGESSA 
S SLTMAHVPWHEEWQFVRELVDLI PEYE IACEHEHSNCLLI AH 
RKFKIGGEWWTWINYNRFQEXjIQEYEDSGGSKTFSAKDYMARTP 
HWAIi FGAS ERGFDPKDTRHQRKNKS KAISGC 


6857 




OJL / 


K13 PEATAMV C V USHPNCRQNHIKPSHSAAQTW CG S PTPASAPNH 
KLMAMEQGKTL PS ATE DAKEEGLEAQ I S RLAE L IGRLESKALWF 
DLQQRLSDEDGTNMHLQLVRQBMAVCPEQLSEFLDSLRQYLRGT 
TGVRNCFHITAVRIiSDGFTFVIYEFWETEEAWKRHLQSPLCKAF 
RHVKVDTLSQPEALSRILVPAAWCTVGRD 


6858 " 


2 


669 


RSRGIKDFENDPPLSSCGIFQSRIAGDALIiDSGIRISSVFASPA " 
LRCVQTAKL I LEELKLEKKI KI RVE PG I FEWTKWEAGKTTPTLM 
SLEELKEANFMIDTDYRPAFPLSALMPAESYQEYMDRCTASt4VQ 
IVNTCPQDTGVILIVSHGSTLDSCTRPLLGLPPRECGDFAQLVR 

K I PS LGMCFCEEN5CEEQKWITLVNP DVVTT .top jv RMbTDKTU t o 

GN 


6859 


1 


1150 


GETMFKKAKTKAKKKPRKRSDSSGGYNLSDIIQSP'SSTGLLKSG " 

AKVKPYVNGTS PVYSREDLKPWEKS P I LKISAPQPI PSNRI DTT 
SSAS WVAGSFSPVSPPVVDLRTIMEIEESRQKCGATPKSHLGKT 
VSHGVKLSQKQRKMIALTTKENNSGMNSMErVLFTPSKAPKPVN 
AWASSLHSVSSKSFRDFLLEEKKSVTSHSSGDHVKKVSFKG I EN 
SQAPKIVRCSTHGTPGPEGNHISDLPLLDSPNPWLSSSVTAPSM 
VAP VTFAS I VEEELQQEAAL I RSREK PLALIQ IEEHAI QDLLVF 
YEAFGNPEEFVIVERTPQGPLAVPMWWKHGC 


6860


1889 


1515 


DKDKKRQKKRGIFPKVATNIMRAWLFQHLTHPYPSEEQKKQLAQ 
DTGLT ILQVNNWF INARRI I VQPMIDQSNRAVSQGAAYSPEGQP 
MGS FVLIX3QQHMGIRPAGPMSGMCMNMGKDGQWHYM 


6861


1889 


1515


DKDKKRQKKRGIFPKVATNIMRAWLFQHLTHPYPSEEQKKQLAQ 
DTGLT I LQVNNWF1NARR I IVQPMIDQSNRAVS QGAAYS PEGQP 
MGS FVLDGQQHMGIRPAGPMSGMGMNMGMDGQWHYM 




2 


471 


EEIDREFHNKLKLKEDKLEKQEKPVNGBDKGDSGVDTQNSEGNA 
DEEDPLGPNCYYDKTKS PFDNISCDDNRBRRPTWAEERRLNAET 
FG I PLRPNRGRGG YRGRGGLG FRGGRGRGGGRGGTFTAPRG FRG 
GFRGGRGGREFADFEYRKTTAFGP 


6863 


2216 


467 


PQ E PALKSE FS QVASNT I P LP LPQ PNTCKDNG PCKQ VCST VGGS 
AICSCFPGYAIMADGVSCEDQDECIiMGAHDCSRRQFCVNTLGSF 
YCVNHTVLCAIX3YILNAHRKCVDINECVTDLHTCSRGEHCVNTL 

KGSFYCQARQRCMDGFIjQDPEGNCVDINECTSL«EPCRPGFSCI 
NTVGSYTCQRNPLICARGYHASDLXJTKCVDVNECETGVHRCGEG 
QVCHNIiPGSYRCDCKAGFQRDAFGRGCIDVNECWASPGRLCQHT 
CENTLGSYRCSCASGFLLAADGKRCEDVNECEAQRCSQECANIY 
GSYQCYCRQGYQLAEDGHTCTD1DECAQGAGI LCTFRCLNVPGS 
YQ CACPEQG YTMTANGRS CKDVDB CALGTHNC S EAETCHN I QG S 
PR CLRFECP PNYVQ VS KTKCERTTCHDFIiEOQNS PARITHYQLN 
FQTGLLVPAHI FR IGPAPAFTGDT IALNI IKGNEEGY FGT RRLN 
AYTGVVYLQRAVLEPRDFALDVEMKLWRQGSVTTFLAKMHI FFT 
TFAL 


6864 


2 


2933 


LADSSPSNLQI 1 1 KELLSMHHQPDPALTKEFDYLPP VDSRSSSG 
FVG LRNGGAT C YMN A VFQQL YMQPG L P ES LL S VDDDTDN P DD S V 
FYQVQSLFGHLMESKLQYYVPENFWK1 FKMWNKELYVREQQDAY 
EFFTSLIDQMDEYLKKMGRDQIFKNTFQGIYSDQKICKDCPHRY 
EREEAFMALNLGVTSCQSLEISIiDQFVRGEVLEGSNAYYCEKCK 
EKR I TVKRTC I KSLPS VLVI HLMRFG FDWBSGRS I KYDEQ I RFP 
WMLNMEP YTVSGMARQDSS S E VGENGRS VDQGGGGS PRKKVALT 
ENYELVGVIVHSGQAHAGHYYSFIKDRRGCGKGKWYKFNDTVIE 
EFDLNDE TLE YECFGGBYR PKVYDQTNPYTDVRRRYWNAYMLF Y 
QRVSDQNSPVLPKKSRVSWRQEAEDLSLSAPSSPEI3PQSSPR 
PHRPNNDRLS ILTKLVKKGEKKGLFVEKMPARI YQMVRDENLKF 
MKjmDVYSSDYFSFVLSIASIiNATKLlOrPYYPCMAJCVSI^ 

SEGRELIKIFLLECNVREVRVAVATILEKTLDSALFYQDKI.KSL 
HQLLE VliLALLDKDVPBNCKNCAQY FFLFNTFVQKQG I RAGDLL 
LRHS ALRHM I S FL LGASRQNNQ IRR WS S AQAREFGNLHNTVALL. 
VLHSDVSSC3RNVAPGIFKQRPPISIAPSSPLLPLHEEVEALLFM 
S EGKPYLLEVMFALRELTGSLLALIEMVVYCCFCNEHFSFTMLH 
FIKNQLETAPPHELKHTFQLLHEILVIEDPIQVERVKFVFETEN 
GLLALMHHSNHVDSSRCYQCVKFLVTLAQKCPAAKEYFKENSHH 
WS WAVQ WLQKKMSEHYWTLQS NVS NETS TGKTFQRT I S AQDTLA 
YATALiLNEKEQSGSSNGSESS PANENGDKHLQQGSES PMMIGEL 
RSDLDDVDP 


6865 


1820 


1242 


DPERWKHLSKVTPPGSSVSTTPVQWRLQSPQSQGSMMPSCNRS 
CS CSRG PS VEDGKWYGVRS YLHLFYEGYAVPP KLEGIGEGBFLV 
LDQRAADYNQALGTCRLAGTALCVAAGVLLAICLFWAMIGWLSQ 
DTKAEPLDPEADSHVEVFGDEPEQQIiSPIFRNASGQSWFSPPAS 
PFGQSSVQTIQPKRDS 


6866 


1571 


495 


DCPRPRYTLYGLRATCMRDLDWAWINAVSAFKALEQDLPVNI KF 
I IEGMEEAGSVALEELVEKEKDRFFSGVDYIVISDNLWISQRKP 
AI T YGTRGNS YFMVE VKCRDQDFHSGTFGG ILHE PMADLVALLG 
SLVDSSGHILVPGIYDEWPLTEEEINTYKAIHLDLEEYRNSSR 
VEKFIiFDTKEEILMHLWRYPSLSIHGIEGAFDEPGTKTVIPGRV . 
I GKFS I RL VPHMNVSAVEKQVTRHLEDVFS KRNSSNKMWS MTL 
G LHP W I AN I DDTQYXiAAKRAIRTVFGTEPDH I RDGSTI P I AKMF 
QEIVHKS WLI PLGAVDDGEHSQWEKINRWNYIEGTKLFAAFFL 
EMAQLH 


6867


2633 


1704 


gtrimsqpkqkeiagfvrqkmLldysvymgrcvpqesrspqrsp" 
ywsdlvskkiqmkijskiki*pyfmnbltltbiidmgvavpkiloap 

KPYVDHQGLWIDLE^YNGSFLMTLETKMNLTKLGKEPLVEAIiK 
VGEIOKEGCRPRAFCLADSDEESSSAGSSEEDDAPEPSGQDKQL 
L PG AEG YVGGHRTS KIMRF VDKITKS KYFQKATETEFI KKK IEE s 
VSNTPLLLTVBVQECRGTIiAVNI P P PPTDRVWYGFRKPPHVEIiK 
ARPKLGEREVTLVHVTDWI EKKLEQEFQKVPVMPNMDDVYITIM 
HSAMDPRSTSCLLKDPPVEAADQP 


6868


1 


346


RPTRPPTRPfiBIKNLII^YISD^FVQDtCEDPYELFKfDICGFb" 
KATFES QMS VMRGQ ILNLTQ AXiRDG KS P FQLVQ I P C V I VERSQG 
GSQGRIVHLSNSFTQTVNCRKPFPSSW 


6869


3 


1619 


MYMBRMDKRALISFWESVEHLKNANKKTEIPQLVGEIYQNFFVES 
KE IS VE KSXYKE IQQCLVGNXG I EVFYKIQEDVYETLKDR YYPS 
r 1 VS L/JjiEKLLI KEBBKHASQMI SNKDEMGPRDEAGEE AVDDGT 
NQINEQASFAVNKLRELNEKLEYKRQALNSIQNAPKPDKKIVSK 
LKDEIILIEKERTDLQLHMARTDWWCENLGfWKASITSGEVTEE 
NGEQLPCYFVMVSLQEVGGVETKNWTVPKRLSEFHNLHRKLSEC 
VPSLKKDQLPSLSKLPPKS IDHTFMEKFENQLNKFLQNKLSDER 
LCQSEALYAFLS PSPDYLKVIDVQGKKNSFSLSS FbERLPRDFF 
SHQEEETEEDSDIjSDYGDDVDGRKDALAEPCFMLIGEI FELRGM . 
FKWVRRTIiIALVQVTFGRT I NKQIRDTVS WI FS EQMLVYYIN I F 
RDAFWPNGKLAPPTTIRSKEQSQETKQRAQQKLLENI PDMLQS L 
VGQQNARHGI IKI FNALQETRANKHLLYALMELLLIELCPELRV 
HtiDQLKAGQV 


6870


1 


15*6 


MAAWAATRWWQLLLVLSAAGMGASGAPQPPNILLLLMDDMGWG 
DLG VYGEPS RETPNIiDRMAAEGLLF PNFYSANPLC3 PSRAALLT 
GRLPIRNGFYTTNAHARNAYTPQEIVGGIPDSEQLLPELLKKAG 
YVSKIVGKWHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNKARP 
NI P VYRDWEMVGRYYEEFP INL KTGEANLTQI YLQEALDF I KRQ 
ARHHPFFLYWAVDATHAP VYASKPFLGTSQRGRYGDAVRE I DDS 
IGKI LEIiLQDLHVADNTFVFFTSDNGAALIS APEQGGSNG P PLC 
GKQTTFEGGMREPALAWWPGHVTAGQVSHQLGS IMDLFTTSLAL 

QHKAHFWTWTNS WENFRQG Z DFCPGQNVSGVTTHNLEDHTKIiPIi 
IFHLGRDPGERFPLSFASAEYQEALSRITSWQQHQEALVPAQP 
QLNVCNWAVMNWAPPGCEKLGKCLTPPES I PKKCLWSH 


6871 


209 


1126 


QEVLQKAQQSGRS KCLKCGGSRMFYCYTCYVP VENVP IEQ I PLV 
KLPLKIDI I KHPNETDGKSTAIHAKLLAPEFVNI YTYPCIPE YE 
EKDHEVALIFPGPQSISIKDISFHLQKRIQNNVRGKNDDPDKPS 
FKRKRTEEQEFCDLNDSKCKGTTLKKIIFLDSTWNQTNKIFTDE 
RliQGLLQVEliKTRXTCFWRHQKGKPDTFLSTIEAIYYFLVDYHT 
DILKEKYRGQYDNLLFF YSFMYQLI KNAKCSGDKETGKLTH 


6872 


880 


459 


FGLLMWLSLI FMKGNC VREDLI FNFLFKLGLDVRETNGLFGNT 
KKLI TEy FVRQKYLE YRRI P YTEPAE YEFLWGPRAFtiE TS KML V 

LRFIjAKLHKKDPQSWPFHYLEALAECEWEDTDEDEPDTGDSAHG 
PTSRPPPR 


6873


1929 


955 


DEQAVLCSKDKTYDLKIADTSNMLLFIPGCKTPDQLKKEDSHCN 
I IHTE I FGFSNNYWEIiRRRRPKLKKLKKLLMENP YEG PDSQKE K 
DSNSSKYTTEDLLDQIQASEEEIMTQLQVLNACKIGGYMRILEF 
DYEMKLLNHVTQLVDSESWSFGKVPLNTCLQELGPLEPEEMIEH 
CIJCCYGKKYVDEGEVYFELDADKI CRJiAARMLLQNAVKF^ 
QEVWQQSVPEGMVTSL^LKGLALVDRHSRPEIIFLLKVDDLPE 
DNQERFNS LFSLREKWTBED IAP YIQDLCGEKQT I GALLTK YS H 
S SMQNG VKVYNSRR PIS 


6874 


1 


307 


dsiadhvnsaavtw:egtknix;kaakyki^pvagaliggmvg 

GPIGLLAGFKVAGIAAALGGGVLGFTGGKLIQRKKQKMMEKLTS 
SCPDLPSQTDKKCS 


6875 


1688 


349 


V1GTGERGNSASEKWBIMFNEELGDPFIIIHSISLLNAEEHSIA 
T LLLR I EK EE LDM KGSG PYVSLE WVTISKKNQDHKKYE 1 1 KRD I 
LRGKSVPHYAAIEPDGNGLMIVSyKSLTFVQAGQDLBENMDBDI 
SEKIIOilPLYYWQQTEDDIjTVTIRLPEDNTKEDIQIQFLPDHINI 
VWCDHQ FLEGKLYSS I DHE SS TW 1 1 KESNSLE I S LI KKN EGLTW 
PELVIGDKQGELIRDSAQCAAIAERLMHLTSEELNPNPDKEKPP 
CNAQELEECD IFFEESSSLCR PDGNTLKTTHVVNLGSNQ YLFS V 
IVDPKEMPCFCLRHDVDALLWQPHS SKQDDMWBHIATFNALGYV 
QASKRDKKFFACAPN YS YAAL CE CLRRVPI YRQPAPMSTVLYNR 
KEGRQVGQVAKQQVASIiETND P I LG FQATHERLFVLTTKNLFLI 
KVNTEN 


6876 


41 


1285 


VGEMTLIWRHLIjRPLCLVTSAPRILEMHPFLSLGTSRTSVTKLS~ 
LHTKPRMPPCDPMPERYQVIEXVNSGSEANEIiAMLMARAHSNNI 
DI ISFRGAYHGCSPYTLGLTNVGIYKMELPGGTGCQPTMCPDVF 
RGPWGGSHCRDSPVQTIRKCSCAPDCCQAKDQYIEQFKDTLSTS 
VAKS I AGF FAE P I QGVNG WQY P KG FLKEAFE L VRAR GG VC IAN 
E VQTGFGRLGS HFWGFQTHDVLPDI VTMAKG I GNG FPMAAVITT 
PEIAKSLAKCLQHFNTFGGNPMACAIGSAVLEVIKEENLQENSQ 
EVGTYMIiLKFAKLRDEFEIVGDVRGKGLMIGIEMVQDKISCRPL 
PREE VNQIHED CKHMGLLVGRGS I FSOTFR I APSMC ITKP EVDF 
AVEVFRSALTQHMERRAK 


6877 


1 


11B 


GTS PSPARAYAPPTERKRFYQNVSITQGEGGFEINLDHRKLKTP 
QAKLFTVPSEALAIAVATEWDSQQDTI KYYTMHLTTLCNT SLDN 
PTQRNRI)QIjIRAAVKFLDTDTICYRVEEPETLVEIiQRNEWDPII 
EWAE KRYGVE ISSSTSIMGPS IPAKTREVLVSHLAS YNTWALQG 
I EFVAAQLKSMVLTIjGL IDLRLTVEQAVLLSRLEEE YQ I Q KWGN 
IEWAHDYELQELRARTAAGTLFIHLCSESTTVKHKLLKE 


6878 


931 


263 


C^LC^DFKNRASMiDFNIRIKNVTRSDAfiKYRCEVSAPSEQGQN 

leedtvtlevlvapavpscevpssalsgtwelrcqdkegnpap 
eytwfkdgirllenppxgsqstnssytmotktgtlqfotvskld 

TGE YSCEARNS VGYRRCPGKRMQVDDLNISGI IAAWWALVIS 

vcglgvcyaqrkgyfsketsf^ksnssskattmsendfkhtksf 
II 


6879 


3 


845 


IRVIGESDIMQEFLSESDENYNGVSDVELRVALPDGTTVTVRV^ 

knsttdc^qaiaakvgmdsttvnyfalfevishsfvrklapne 
fphklyiqnytsavpgtcltirkwlftteeeillndndlavtyf 
fhqavddvkkgy i kaeeks yqlqkl yeqrkmvmylnmlrt ceg y 
nei i fphcacdsrrkghvi tais ithfklhacteegqlenqvia 

FEWDEMQRWDTDEEGMAFCFEYARGEKKPRWVKI FTPYFNYMHE 
CFERVFCELKWRKEEY 


6880


2110 


1437 


RKDNCTAKEWTFPEAKWNTTARVFSHIRLGMGriVLIIVQCFISS 
MAN I YNEKILKEGNQLTES I F IQNSKL YFFGI L FNGLTLGLQRS 

fJDnnT ITMPr3Ti , T< T Vf2'H15 21PQ\711T.TI7T7 T PXT?nr2T OinvWTT VOT nvrwrsTT 
nnuyiiuiuur r lunJUW a ViUiXr V lAfUurusVAr XLiilfLilJNMFH 

VIxMAQVTTVIITWSVLVFDFRPSLEFFLEAPSVLLSIFIYNAS 
KPQVPEYAPRQERIRDLSGNLWERSSGDGEELERLTKPKSDESD 
EDTF 


6881 


2638 


2244 


NDSKWEDIHVITCAI^FFREtPEPLKTFNHFND'FVNATKG^PR" 
QRVAAVKDL IRQLPKPNQDTMQILFRHLRRVIENGEKNRMTYQS 
IAIVFGPTLLKPEKETGNIAVHTVYQNQrVELILLELSSIFGR 


6882 


1 


B50 


GIPEAQIiWIYPVKSCKGVPVSEAECTAMGLRSGNLRDRFWLVIN 
QEGNMVTARQBPRLVLISIiTCDGDTLTLSAAYTKDLLLPIiCTPT 
TNAVHKCRVHGLE IEGRDCGEATAQWITS FLKSQPYRLVHFE PH 
MRPRRPHQIADLFRPKDQIAYSDTSPFLILSEASLADLNSRLEK 
KVKATNFRPNI VI SG CDVYAEDS WDELL IGDVELFCR VMACSRCI 
LTTVDPDTGVMSRKEPLBTLKSYRQCDPSERKLYGKSPLFGQYF 
VLENPGTIKVGDPVYLLGQ 


6883 


2794 


2256 


nsklklnqnlklfitltyqvlslhgWgpg'ihi^kegafpvtqnr 

AI£XiLYDLRYLNI VLTAKGDBVKS GRSKPDSR IEKVTDHLEAL I 
DPPDLDVPTPHLNSNLHRLVQRTSVLFGLVTGTENQLAPRSSTF 
NS QE PHN I LPIiAS SQ I RFGLLPLS MTSTRKAKSTRNI ETKAQYD 
ANC 


6884


2 


99 


EFERVTAEAVfCPRETSE PRAAAQR FCEKFPFL 


6885 


297 


1554 


STGQFWHVTDItHLDPTYHITDDHTKVCASSKGANASNPGPFGDV 
LCDSPYQLILSAFDFIKNSGQEASFMIWTGDSPPHVPVPELSTD 
TVINVITNMTTTIQSLFPNLQVFPALGNHDYW PQDQLSWTSKV 
YNAVANLWKP WLDEEA1 S TLRKGG FYS QRVTTNPNLRI I S LNTN 
L YYGPN IMTLNKTDPANQ FEWLE STLNNSQQNKE KVYI IAHVPV 
GYLPSSQNITAMREYYNEKLIDIFOKYSDVIAGQFYGHTHRDSI 
MVLS DKKGS P VNS LFVAPAVT PVKS VliEKQTNNPG IRLFQ YDPR 
D YKLLDMLQ Y YLM LTEANLKGES I W KL E Y ILTQTYDIEDLQP ES 
LYGIiAKQFTILDS KQF I KYYNYFFVS YDSSVTCDKTCKAFQ I CA 
IMNLDNISYADCLKQLYIKHNY 


6886 


2 


1341 


QCGG I PGREGGSS RPLEEGTGS SPACVRGAAPGSEDAFYPTRAK 
QARVSQELKKAAXRTVS IS EG PDTLGDGMRERRETLALAP EPEP 
LEKEACEKWKR PFR5 ASATS LTLS H CVDWKGLLDFKKRRGHSI 
GGAPEQRYQI I PVCVAARL PTRAQD VLDAHLSE VNAVRFG PNS S 
IiLATGGADRLIHLWNWGS RLEANQTLEGAGGS ITS VDFDPSGY 
QVLAATYNQAAQL WKVGEAQS KETLS GHKDKVTAAKFKLTRHQ A 
VTGSRDRTVKEWDLGRA YCS RT INVLS YCND WCGDHI I ISGHN 
DQKlRFWDSRGPHCTQVIPVQGRVTSLSIiSHDQLHLLSCSRDNT 
LKVI DL»R VSN IRQVFRADG FKCGSDWTKAVFS PDRS YALAGS CD 
GAL YI WD VDTGKLESRLQG PHCAAVNAVAWCYSGSHMVSVDQGR 
KWLWQ 


6887 


1047 


116 


WTARPSQKPFWEAGAVPGDPIiSTGCSQAQLGGCCPRGPWGPQHG 
GQQRAAGPTLPRGERGGPQQSGPGLAAQTPPTSKQVAWRAFLTG 
TYRSQS PRSPAGP FRGGTG WW PEPAVC LCVAVGPQRLfl S PGLVY 
NASGSEHCYDIYRLYHSCADPTGCGTGPDARAWDYQACTEINLT 
PASNNVTDMFPDLPFTDELRQRYCLDTWGVW PR PDWLLTS FWGG 
DLRAASN 1 1 PSNGNLD PWAGGGI RRNLS ASV I AVTI QGGAHHLD 
LRASHPEDPASWEARKLEATIIGEWVKAARREQQPAIiRGGPRL 
SL 


6888 


1 1 


992 


FVA YVKKEI PH I WTHCLLNPHALVI Kl'LPTKIiRDAiFT WRVI 
NFI RGRAPNHRLFQAFFEEIGI E YSVLLFHTEMRWLSRGQILTH 
IFEMYEEINQFLHHKSSNLVDGFENKEFKIHLAYLADLFKHLNE 
LSASMQRTGMNTVSAREKLSAFVRKFPFWQKRIEKRNFTNFPFL 
EEIIVSDNEGIFIAAEITLHIiQQLSNFFHGYFSIGDIiNEASKWI 
IjDPFIiFNIDFVDDSYIjMKNDIiAELRASGQILMEFETMKLEDFWC 
AQFTAFPNLAKTALE I LMPFATTYLCELGFS ITFTFQNKVPEAA 
LILS DDI RVAI SKKVPS FLGHH 


6889 


1 


1534 


ltlenqikeereqdnsespngrtsplvsqnneqgstlrdllttt 
agklr vgs tdag iafapvysmgaps s ks grtmpnildd i ias w 

ENKIPPSKTSKINVKPELKEEPEESIISAVDENNKLYSDIPHSW 
I CEKHI LWLKDYKNS SNWKLFKE CTJ KQGQ PAVVSGVHX KMITISL 
WKAES I S LD FGD HQ ADLIiNCKDS 1 1 SNANVKE FWDGF E B VS KRQ 
KNKSGETWLKLKDWPSGEDFKTMMPARYBDLLKSLPLPEYCNP 

egkfnlashlpgffvrpdlgprlcsaygwaaxdhdigttnlhi 

EVS DWNILVYVG I AKGNG ILS KAG ILKKFEEEDLDDILRKRLK 
DSSEIPGALWHIYAGKDVDKIREFIjQKISKEQGLEVLPEHDPIR 
EK3SWYVNKKLRQRLLBEYGVRTWTLIQFLGDAIVLPAGALHQVQ 
NPHSCIQVTEDFVSPEHLVESPHLTQELRIiLKKEINYDDKLQVX " 
NILYHAVKEMVRALKI HEDBVDDMEEN 


6890 


3 


667 


THACGMW I PLYLHRAIiWHKTAETCNS PPCGAKDSL I FGAIT"CF 
TG FLGVDTGAGATRWCRIjKTQRADPLVCAVGMLGS A I P I CLI PV 
AAKS S I VGAY I CI FVGETLLFSNWAITADILMYWI PTRRATAV 
ALQSPTSHLLGDAGSPYLIGFlSDIiIRQSTKDSPLWEFLSLGYA 
LMLCPFWVLGGMFFLATALFFVSDRARAEQQVNQ1AMPPASVK 
V 


6891 


1980 


1262 


LRIHQELLSKELKLLRGITIESIIHIGLAAGKEQPMCiDASNVMQ 
LLLKTQSHLYNMBDNNPBVRQAAAYGLGVHAQPGGDDYRSLCSE 
AVPLLVK V I KRAHS KTKKNVI ATENCIS AI GKILKF KPNCVNVD 
E VLP H WLS WL PLHEDKEEAI QTLS FLCDLX E SNHP WI GPNNSN 
LPKI ISIIAEGKINETINYEDPCAKRLANWRQVQTSEDLWLEC 
VSQLDDEQQEALQELLNFA 


6892


3 


876 


RSVAAASGPGAWGTDHYCliELIjRKRDYEGYLCSLtiLPAESRSSV 
FALRAFNVEIAQVKDSVSEKTIGLMRMQFWKKTVEDIYCDNPPH 
QPVAIEIiWKAVKRHNLTKRWLMKI VDEREKNLDDKAYRNI KELE 
NYAENTQS SLLYLTLE I LG I KDLHADHAASH I GKAQG IVTCLRA 
TPYHGSRRKVFLPMDICMLHGVSQEDFLRRNQDKNVRDVIYDIA 
SQAHLHLKHARS FHKTVPVKAFPAFLQTVSLEDFLKKIQRVDFD 
IFHPSLQQKNTLLPLYLYIQSWRKTY


6893


1


642 


DGERKSMSVERTFSEINKAEEQYSIjCQELCSEIiAQDLQICKRLKG 
RTVTIKLKNVNFEVKTRASTVSSVVSTAEBI FAIAKBLLKTEID 
ADFPHPLRLRLMGVRISSFPNEEDRKHQQRSIIGFLQAGNQALS 
ATBCTLBKTDKDKFVKPLEMSHK1CSFFDKKRSERKWSHQDTFKC 
EAVN KQS FQTSQPFQVLKKKMNENLE ISENSDDOQILTCPVCFR 
AQGC I S LEALNXHVDECLDG PS ISEN FXMFSCSHVS ATKVNKKE 
NVPAS S LCEKQD YEAH


6894


1742


1463 


ttlckplvprehqfyetlpaemrkftpqykgksqlleglphwrg"" 
dvrdrghgrpwqpslepslpptlcfpslssfssswpsaqhltps 

VFNPW r *" 


6895


2379 


478 


VTYVELCDLASPTALLIMRTVLDLIVEDLQSTSEDKEQQYTSQT 
TRLLALL YAItASHKACKLAI LHL INGT I KGDERYAE I FQDLLAL 
VRSPGDSVIRQQCVEYVTSILQSLCDQDIALILPSSSEGSISEL 
EQLSNSLPNKEIiMTS I CD CLIiATLANSESS YNCLLTCVRTMM PL 
AEHDYGLFHLKSSIiRKNSSALHSLLKRVVSTPSKDTGELASSFL 
E FMRQ I LNS DT I G CCGD DNGLMEVEGAHTSRTMS INAAELKQ LL 
QS KEES PENLFLELEKLVLEHSKDDDNLDSLLDSWGLKQMLES 
SGDPLPLSDQDVEPVLSAPESLQNLFNNRTAYVLADVMDDQLKS 
M WFTP FQAEE I DTDLDLVKVDLI ELS BKCCSDFDLHS ELERS PL 
SEPSSPGRTKTTKGFKLGKHKHETPITSSGKSEYIEPAKRAHVV" 
PP PRGRGRGGFGQG IRPHDIFRQRKQNTSRP PSMKVDDFVAAES 
KEWPQDGI PPPKRPLKVSQKIS SRGGFSGNRGGRGAFHSQNRF 
FTPPASKGNYSRREGTRGSSWSAQNTPRGNYNESRGGQSNFNRG 
PLPPLRPLSSTGYRPSPRDRASRGRGGLGPSMASANSGSGGSRG 
KFVSGGSGRGRHVRS FTR 


6896 


1 


555 


GN IVIQKKKYNKQH1 IPLENVTIDSIKDEGDLRNGWLIKTPTKS 
FA VYAATATE KSE WMMI INKCVTDL LS KS GKT PSNEHAAVWPD 
SEATVCMRCQKAKFTPVNRRHHCRKCGFWCGPCSEKRFLLPSQ 
SS KPVRICDFCYDLLSAGDMATCQPARSDS YSQSLKS PLNDMSD 
DDDDDDSSD 


6897 


3 


920 


GDGI^HE\TVNGLMERPDWETAIQKPLCSLPAGSGNALAASLNHY ' 
AGYEQVTNEDLLTNCTLU.CRRLLSPMNLLSLHTASGLRLFSVL 
SIiAHGPI ADVDLESE KYRRI/3EMRFTLGTFLRLAALRT YRGRIA 
YLPVGRVGSKTPASPVWQQGPVDAHLVPLEEPVPSHWTWPDE 
D FVL VLALLHS HLGS EM FAAPMG R CAAGVMHL FYVRAG VSRAML 
lrlplamekgrhmbybcpylvyvpvvafrlbpkdgkgVfavdge 

LMVS EAVQGQVHPNYFWMVSGCVE PPPSWKPQQMPPPBE PL 


6898 


919 


346 


UJ\I V lAV/^JoJbiNJbKuui I il^KKRMGUwIKIRFFKIMLVIjIICW 
LSNI INESLLFYLEMQTDINGGSLKPVRTAAKTTWFIMGILNPA 
QGPLLSLAPYGWTGCSLGFQSPRKBIQWESLTXSAAEGAHPSPL 
MPHENPASGKVSQVGGQTSDEALSMLSEGSDASTIEIHTASBSC 
IN rsi St»DPAuPTHGDIj 


6899 


120 


627 


MKVRKNNDAYLLDKNKINMDCFISCPPKKI^TTI^FSHSGILSL 
LBHGEEYTFSLPCAYARS ILTVPWVELGGKVSVNCAKTGVSAS I 
TFHTKP FYGG KLHR VTAEVKHN I TNTWC RVQG E WNSVL E FTYS 
wLiur i A.r viJijTK-LiAVTKKRVRPLEKQDPFlsSRRLW 
EIDKATEHKHTLEERQRTEERHRTETGTPWKTKYF1KEGDGWVY 


6900


3 


451 


TB VLGS KG IHE LRS STS ALHHALE ESASLLTMFWRAALPS THIP 
1 VLPGKVGESTERELIiELRTKVSQQEQIjLQSTTEHLKNANQQKES 
MEQF I VSQLTRTHDVLKKARTNLEVR KLLHQSEAPSLS PTHHHP 
IiADLVGDSWPALRFQEK 


6901 


1 


201 


DDNMV QRLBTDFKMTI^X2QSrLEQWAAWLDNVMMQALKPyEGRP 1 
SFPKAARQFLLKWSFYRYHIGFS , 


6902 


2 


267 


gapppppsqpprqppqaapsshphsdltfnpssalegqagaqgaH 

SDMPBPSLDIiLPELTNPDELLSYLDPPDLPSNSNDDLLSLFENN | 


6903


1 


149 


RINQVYRQGPTG I HI LVI DQMVQN FQDESCFLFSTVKAESSbG T| 
HULK 


6904 


464 


2092 


MEASL P VSLS CVLACGDVEGKFDI LFNRVQAI Q KKSGNFDLLLC| 
VGNFFGSTQDAEWEE YKTG I KKAPIQTYVLGANNQETVKYPQDA 
DGCELAENITYLGRKGIFTG S SGLQ I VYLSGTBS LNEPVPG YS F 
SPKDVSSLRMMLCTTSQFKGVDILLTSPWPKCVGNFGNSSGEVD 
TKKCGSALVSSIATGI^PRYHFAAIiEKTYYBRLPYRNHIILQEN 
AQHATRF IALANVGN PE KX KYL YAPS I V PMKLMDAAE LVKQ PP D 

VTENPYRKSGQEASIGKQIIiAPVEESACQFFFDLNEKQGRKRSS 
TGRDSKSSPHPKQPRKPPQPPGPCWFCLASPEVEKHLWNIGTH 
CYIiALAKGGLSDDHVLILPIGHYQSWELSAEWEEVEKYKATL 
RRFFKSRGKWCVVFERNYKSHHLQLQVIPVPISCSTTDDIKDAF ' 
ITQAQEQQIELLE I PEHSD1 KQIAQPGAAYFYVELDTGEKLFHR 
I KKNF PLQFGRBVLASEAILNVPDKSDWRQCQI S KEDEETliARR 
FRKDFEPYDFTLDD | 


6905 


1 


226 


VSICTGEAETITSHYLFALGVYRTLYLFNWiWR^HFEGFFDLIAf 1 
VAGLVQTVLYCDFFYLYrTKVLKGKKLSLPA 


6906 


3 


611 


S YDDHNGHI DFITAASNLRAKM YS IE PADRFKTKRiAGKI I PAI 1 
ATTTATVSGLVALEM I KVTGG YPFEAYXNWFLNLAI P I WFTET 
TEVRKTKIRNGlSFTIWDRWTVHGKEDFTIiIiDFINAVKEKYGIE 
PTMWQGVK>ILYVPVMPGHAKRLKLTMHKLVKPT^ 
SFAPDIDGDEDLPGPPVRYYFSHDTD | 


6907 


2 


22^8 


LRGVP VWAAGAFR FS S GEESTS HL I MS RRSQR t/TR YS QGDDDGS H 
S S SGGSS VAGSQSTLPKDS PLRTLKRKSSNMKRLSPAPQLGPS S 1 
DAHTSYYSESLVHESWFPPRSSLEELHGDANWGEDLRVRRRRGT 
GGSESSRASGLVGRKATEDFLGSSSGYSSBDDYVGYSDVDQQSS 
S SRLRSAVS RAGSLLWMVATSPGRLFRLLYWWAGTTWYRLTTAA 
SLLDVFVLTRRFS SLKTFLW FLL PLLLLTCLTYGAWYFYP YGLQ 
TFHPALVSWWAAKDSRRADEGWEARDSS PHFQAEQRVMSRVHSL 
ERRLEALAAEFSSNWQKEAMRLERLELRQGAPGQGGGGGLSHED 
TLALLBGLVSRREAALKEDFRRETAARIQEELSALRAEHQQDSE 
DLFKKIVRASQES BAR IQQLKS EWQSMTQES FQBSS VKELRRLE 
DQ LAGLQQELAALALKQSSVAEE VGLLPQQ IQAVRDDVESQFPA 
WISQFLARGGGGRVGIjLQREEMQAQLRELESKIIjTHVAEMQGKS J 
AREAAASLSLTLQKEGVIGVTEEQVHHIVKQALQRYSEDRIGLA 
DYAIiESGGASVISTRCSETYETKTALLSLFGl PLWYHSQSPRVI 
LQPDVHPGNCWAFQGPQGFAWRLSARIRPTAVTLEHVPKALSP 
NSTIS SAPKDFAI FG FD ED LQQEGTL LGKFTY DQDGB P I QTFHF 
QAPTMATYQ WE LR I LTNWGH PE YTC I YRFR VHG BPAH 


6908


3 


780 


QVPSAAWLMAVCGLGSRLGliGSRLGLQGCFGAARLLYPRFQSRG 
PQGVEDGDRPQPSS KTPRI PKI YTKTGDKG FS STFTGBRR P KDD 
QVPBAVG'rTDELSSAIGFALELVTEKGHTFAEELQKlQCTLQDV 
GSAIATPCSSAREAHLKYTTFKAGPILELEQWIDKYTSQLPPLT 
AFILPSGGKISSALHFCRAVOUIAERRVVPLVQMGETDANVAKF 
LNRLSDYLFTLARYAAMKEGNQEKIYKKNDPSAESEGL 


6909 


3 


409 


GRIiLAVGTDLYGQRSSAPEQELLVQDATPVSNS LLPEKAFSDI P 
SPYLRGTIKMMQAVRQAFQDQDDRRTWDGRPLTMAATFDDCLYA 
LCVVDTIKRSSQTGEWQNIAIMTBEPEIiSPAYLISEAMRRSRMS 
LYC 


6910 


1 


1068 


LVPWVIDS YYYGKLVI APLNIVLYNI FTPHGPDLYGTE PWYFY 
LING FI27FNVAFALALLVLPLTS LME YLLQRFHVQNLGHP YWLT 
LAPMYI WFI I FFIQPHKEERFLFPVYPLICLCGAVALSALQHSF 
LYFQKCYHFVTQRYRLEHYTVTSNWLALGTVFLFGLIiSFSRSVA 
LFRGYHGPLDLYPEFYRIATDPTIHTVPEGRPVNVCVGKEWYRF 
PSSFLLPDNWQLQFIPSBFRGQLPKPFABGPIiATRIVPTDMNDQ 
NLEEPSRYIDI3KCWYLVDLDTMRETPREPKYSSNKEEWISLAY 
RPFLDASRS S KLLRAFYVP FLS DQ YTVYVNYT ILKPRKAKQ IRK 
KSGG 


6911


1184 


966 


G EDAEEMETGN VANLI S I FGS8FSGLLRKSPGGGREEEEGEESG 
PEAAEPGQICCDKPVLRDMNPWSTAIVAF 


6912 


1 


844 


AMKP VETHSFQMLFTILSTGSALKAQS YEDAYRCI KSS ILLGS I 
SGGTDI IS CFMGHNFS L P VYKGEI QARNLGMAVEAV7NEEG KAVW 
GESG ELVCTKP I PCQPTHFWNDENGMKYRKAYFS KFPG I WAHG D 
YCRINPKTTGGIVMLGRSDGTLNPNGVRFGSSEIYNIVES FEEVE 
DS LCVPQYNKYRE ERVI LFL-KMAS GHAFQPDL VKR I RDA I R$G L 
SARHVPSLILETKGIPYTLNGKKVEVAVKQIIAOKAVEQGGAFS 
NPBTLDLYRDIPELQGF 


6913 


1643 


1558


KKSHEESHKEELSYGAQASLPLPCSDFR 


6914 


1251 


615 


ELAAECKSAGY PGTLIPYRCDLSNEEDILSMFSAt RSQHSGVDl" 
CINNAGIiARPDTLLSGSTSGWKDMFIJVNVLALSICTREAYQSMK 
ERNVDDGHI ININSMSGHRVLPLSVTHFYSATKYAVTALTEGLR 
QELREAQTHIRATCISPGWETQFAFKLHDKDPEKAAATYEQMK 
CLKPEDVAEAVI YVLSTPAHIQIGDIQMRPTEQVT 


6915


254 


652 


GRSLS FKTFIiI WVLIS I YQGGILMYGALVLFESEF VHWA I S FT 
AL I LTELLMVALTVRTWH WLMVVABFLS LGCYVSS LAFLNE YFD 
VAF I TTVTFLWKVS AITWS CLPL YVLKYLRRKLS P PS YCKLAS 


6916 


254 


652 


GRSLS FKTFLI VfVLIS I YQGGILMYGALVLFESEF VHWAI S FT 
ALILTELLMVALTVRTWHWLMWABFLSLGCYVSS LAFLNEYFD 
VAFITTVTFLWKVSAITVVS CLPL YVLKYLRRKLS PPS YCKLAS 


6917 


254 


652 


GRSLS FKTFLI WVIjISI YQGGILMYGALVLFESEFVHWAI SFT 
ALILTELLMVALTVRTWHWLMWAEFLSLGCYVSS LAFLNEYFD 
VAF I TTVTFLWKVSA1TVVSCLPLYVLKYLRRKLS PPS YCKLAS 


6918


28 


£21 


PEAGTRS WRE PD P EDLRRFLLS AACRS FPQWL PGGGGGQVS S CS 
DTDVP YLLLAVKSEPGRFAERQAVRETWGS PAPGI RLLFLLGS P 
VGBAGPDLDS LVAWESRRYS DLLLHD FLD VP FNQTLKDLLLLAW 
LG RH C PTVS FVLRAQDD AFVHT PALLAHLRAL P PAS ARS L YLG E 
VFTQ AM PLRKPGG PFYVPES FFEGGYPAYASGGGY VIAGRLAPW 
LLRAAARVAPFPFEDVYTGLC I RALGLVPQAHPGFLTAWPADRT 
ADHCAFRNLLLVRPLGPQAS IRLWKQLQDPRLQC 


6919 


850 


41 


QGRRELSGSVFCPFIQQEPKEMLTIjSEyHERVRSQGQQIiQQLQA 
E LDKLHKEVS TVRAANS ERVAKLVFQRLNEDFVRKPDYALSSVG 
AS I DLQKTSHDYADRNTAYFWNR FS FWNYARPPTV ILE PHVFPG 
NCWAFEGDQGQWIQLPGRVQLSDITLQHPPPSVEHTGGANSAP 
RDPAVPFLLSPFTHQGLQVYDETEVSLGKFTPDVEKSBIQTFHL 
QNDP PAAFP KVKI Q I LSNWG H P RFTCLYRVRAHGVRTS EGAEGS 
AQGPH 


6920 


1418 


591 


EAQGPSKVHLTLKKKK 


6921 


2 


1711 


MNATRSEEQFHVINHAEQTLRKMEitoKEKQLCKVLLIAGHLRI ™ 

PAHRiVLSAVSDYPAAMFTNDVLEAKQEEVRMEGVDPNAIoNSLV 

Q YAYTGVLQL KEDTIESLLAAACLLQLTQ VI DVCSNFLI KQ LHP 

SNCLGIRSPGDAC^CTEUjNVAHKYTMEHPIBVIKNQEFLiLLPA 

NE I SKXiLC5 DD INVPDE ETI FKALMQ WVGHDVQNRQGE LGMLLS 

YIRLPLLPPQLLADLETSSMFTGDLECQKLLMEAMKYHIjLPERR 

SMMQSPRTKPRKSTVGALYAVGGMDAMKGTTTIEKYDLRTNSWL 

HIGTMNGRRLQFGVAVIDNKLYWGGRDGIJCTLNTVECFNPVGK 

IWTVMPPMSTHRHGU^ATIlEGPMYAVGGKJX5WSYL^mrERWDP 

EGRQV^VASMSTPRSTVGWALNNXLYAIGGRDGSSCLiCSMEY 

PDPHTNKWS LCAPMS KRRGGVGVATYNG PLY WGGHDAPASNHC 

SRLSDCVERYDPKGDSWSTVAPIiSVPRDAVAVCPLGDKLYWGG 

YDGHTYLNTVESYDAQRNEWKEEVPVNIGRAGACVVVVJCLP 


6922 


1075 


369 


LTPPAGIRHEVRDRBREREREREREKFPLDSTGSELKQNlriSi'r 
GIjP PAMQKVM YKGI1APEDKTI1REIICVTSGAKI MGGGSTINDVLA 
VOTPKIMVAQQDAKAEENKKEPLCRQKQHRKVLDKGKPEIJVMPSV 
KGAQERLPTVPLSGMYWKSGGKVRLTFKLEQDQLWIGTKERTEK 
LPMGS IKNWSEPIEGHEDYHMMAFQLGPTEAS YYWVYWVPTQY 
VDAI KDTVLGKWQYF 


6923 


2469 


1660 


LGLFCILPIDTLCAVLERDTLSIRESRLFGAVVRWAEAECQRQQ " 
LPVTTCNKQKVLGKAIiSLIRFPLMTIEEFAAGPAQSGILSDREV 
VNLFIiHFTVNP JCPRVE YI DRPRCCLRGKECCINRFQQVESRWGY 
SGTSDRI RPT VNRRI S I VGFGLYGS IHGPTDYQVNIQI IEYfifcK 
QTLGQNDTGFS CDGTANTFRVMFKE P IE I LPNVC YTACATLKGP 
DSHYGTKGLKKVVHETPAASKTVPFFFSSPGNNNGTSIEDGQIP 
EIIFYT 


6924 


2210 


1235 


PEERVICFVEYYLTAFHEGRKGAIiAKKPYNPI IGETFHCS WEVP ' 
KDRVKPKRTASRS PAS CHE HPMADDP S KS YKLRFVAEQVSHHP P 
ISCFYCECEBKRLCVNTHVWTKSKFMGMSVGVSMIGEGVLRLLE 
HGEEYVFTLPSAYARSILTIPWVELGGKVSINCAKTGYSATVIF 
HT KP FYGGKVHR VTAEVTCHN PTOTI VCJCAHGE WNGTLE FTYNNG 
ETKVlDTTTLFVYPKKIRPLEKQGPMESRNLWREVTRYLRIiGDI 
DAATEQKRHLEEKQRVEERKRENLRTPWKPKYFIQEGDGSGILQ 
SPLESTLMGLEVQSFPV 


6925 


2 


1653 


RGGAAGAAMBPDSVIEDKTIELMCSVPRSLWLGCANLVESMCAL 
SCLQSMPSVRCLQISNGTSSVIVSRKRPSEGNYQKEKDLCIKYF 
DQWSESDQVEFVEHLISRMCHYQHGHINSYLKPMLQRDFITALP 

ERMVRTDPLWKGLSERRGWDQYLFKNRPTDGPPNSFYRSLYPKI 
IQDIETIESNWRCGRHNLQRIQCRSENSKGVYCLQYDDEKI ISG 
LRBNS I KI WDKTS LECLKVLTGHTGS VLCLQ YDERVI VTGS SDS 
TVRVWDVNTGBVLNTLIHHNEAVLHLRFSNGIiMVTCS KDRS IAV 
WDMAS ATDITLRRVLVGHRAAVNVVD FDDKYI VSASGDRTI KVW 
S TSTCE FVRTLNGHKRG I ACLQ YRDRL WS GSS DNTIRLWD IE C 
GACLRVLEGHEELVRCIRFDNKRI VSGAYDGKI KVWDLQ AALDP 
RAPASTLCLRTLVEHSGRVFRLQFDEFQIISSSHDDTILIWDFIi 
NVPPSAQNETRS PSRTYTY I SR 


6926 


1 


733 


SGRVAMDGLGLQFPEQGFPAGPPLLPPHMGGHYRDO^SLGAPPL 
DaYPLPTPDTSPLDGVDPDPAFFAAPMPGDCPAAGTYSYAQVSD 
YAGPPEPPAGPMHPRLGPEPAGPSIPGLIAPPSALHVYYGAMGS 
PGAGGGRGFQMQPQHQHQHQHQHHPPGPGQPTPPPBALPCRDGT 
DPSQPAELLGEVDRTEFEQYLHFVCKPEMGLPYQGHDSGVNLPD 
SHGAI SSWSDASSAVYYCNYPDV 


6927 


2 


1484


LTLCGDIQLMLAQNANNRAAHLEKFHYQTKEDQEILHSLHRESS 
CQGFAWATDLSTDLESQLSVSCKCYEAANBILQFRDLKSQNPEH 
YVQVL KRMGN I RNE I GV FYMNQAAALQS ERLVS KS VS AAEQQIjW 
KKS FSCFEKGIHNFES IEDATNAALLLCNTGRLMRICAQAHCGA 
GDELKRE FS PEEGLY YNKAI D YYLKALRS LGTRDIHPAVWDS VN 
WBLSTTYFTMATLQQDYAPLSRKAQEOIEKEVSEAMMKSLKYCD 
VDSVSARQPLCQYRAATIHHRLASMYHSCLRNQVGDEHLRKQHR 
VLADLHYS KAAKLFQLLKDAPCELLRVQIiERVAFAEFQMTSQNS 
NVGKLKTLSGALDIMVRTEHAFQLIQKEXiIEEFGQPKSGDAAAA 
ADASPSLNREEVMKLLS I FESRLSFLLLQSIKLLSSTKKKTSNN 

IEDDTILKTNKHIYSQLLRATANKTATLLERINVIVHLLGQLAA 
GSAASSNAVQ 


6928


1086 


777 


EAIDIj INNLIjQVKMRKRYS vdktlshpwlqdyqtwldlreleck 
igeryithesddlrwekyageqglqypthlinpsashsdtpste 
etbmkalgbrvsil 


6929 


1749 


607 


rdqrgyrddrsparepgdvsartrsgggggrsattampppvpng"*" 

nlhqhdpqdlrhngnvwagrpscsrgprraiukpqpaggrrsg 

rgpaagglclqppdggtcvpebppvp pmdwealekhlaglqfre 

qevrnqgqartnstsaqkneresirqklalgsffddgpgiytsc 

sksgkpslssri^sgmnlqicfvndsgsdkdsdaddsktetsld 

tpi^pmskqsssysdrdtteeeseslddmdfltrqkklqaeaicm 

alamakpmakmq vevekqnrkks pvadllphm ph i s eclmkrs l 

KPTDLRDMTIGQLQVIVNDLHSQIESIiNEELVQLIjLIRDEIjHTE 

qdamlvdiedltrhaesqqkhmaekmpak 


6930


131


545


Fra^TANVFVSLFQMRNNFRHYFIEPSQtiKtFYDVITWI^TQVAi

sytwpfvllsikpsltfysswyyclhxlgilvllllpvkktqr

RKNTHENIQLSQS KKFDEGEKSLGQNS FSTTNNVCtf QNQE IAS R
HSSLKQ


6931


2


659 


FVERLPNRPACLLVASGAAEGVSAQS FLHCFTMASTAFNLQVAT 
PGGKAME FVDVTE SNARWVQDFRLKAYAS PAKLES I DGAR YKAL 
LrPSCPGALTDLASSGSLARILQHFHSESKPICAVGHGVAAXCC 
ATNEDRS WVFDS YS LTGPS VCELVRAPG FARLPLWEDFVKDSG 
ACFS AS E PDAVHWIiDRHL VTGQKAS S TVPA VQNLL FLCGS RK 


6932


2 


1131 


FVDSPGQGEQAEEEEGGIQMNSRMRAHS PABGASVES SS PGPKK 
S DMCEGCRSIiAAGHPGYISHDKETS IKYVSHQHPSHPQIjFS IVR 
QACVRSI/S CE VCPGREGPIFFGDEQHGFVFSHTFF I KDS LARGF 
>» R " * a * A 1 ■LririUK. x i Li J. iTi' JrliLGKVRQ 1 1 DELQGKALKVFEA 
BQFGCPQRAQRMNTAFTPFLHQRNGNAARSLTS LTSDDNLWACL 
HTSFAWLLKACGSRIiTEKLLEGAPTEDTLVQMEKLADLEEESES 
W DNS EAE EEEKAP VLPE STBGRELTQGPAESS SLSGCGS WQPRK 
LPVFKSLRHMRQVGGRGTAHHEIiRRRAliWGLCLPTRIiASGPSTL 
KTLQEVTDS LLGG WLMAQGVGG I 1 


6933 


1431 


890 


SLNLHCTLPPPPHQYPAGYPSDKEGKKPKGQSKKQPSGTTKRPI" 
SDDDCPSASKVYKASDSAEAIEAFQLTPQQQHIilREDCQNQKLW 
DE VLSHLVEGPNFLKKLEQS FMCVCCQELVYQ P VTTECFHNVCK 

DCLQRSFKAQVFSCPACRHDLGQNYIMIPNEILQTLLDLFFPGY 
SKGR 


6934 


3030 


2588 


DRDHSQCGG IRRVALARVSS VKLI SKAKIRTVKMTFI I VLAFI V 
CWTPPFFVQMWSVWDANAPKBASAFI I VMLIASLNSCCNPWI YM 
LFTGHLFHELVQRFLCCSASYLKGRRIiGETSASKKSNSSSFVliS 
HRSSSQRSCSQPSTA 


6935 


886 


543 


NSAXlYVAGGNDGTSCI^NSVBRYSPKAGAWESVAP^4NIRRSTHDL 
VAMDGWLYAVGGNDGSSSLNS IEKYNPRTNKWVAASCMFTRRSS 
VGVAVLELLNFPPPSSPTLSVSSTSL 


6936 


1347 




— £67 


13 t?HT?DOTTTiQT?ivT.T.Pl?CV2VCUDDt>UDT T?D Vcr Kmr< T UVCUT Tver m 

TCLHFLRKRLQKGEVGLSVETSKPQVPVGGLSRKKVPQEPWATV 
ME KRLQE AQL YKEEGNQRYR BG KYRDAVSRYHRALLQLRGLD PS 
LPS PLPNLGPQG PALTPEQEN ILH1TQTDC YNNIoAACLLQM BP V 
NYERVRE YSQ KVLERQ P DNAKAL YRAGVAF FH LQDYDQARU YLL 
AAVNRQPKDANVRRYLQLTQSELSSYHRXEKQLYLGMFG 


6937 


± 


=== 

/z / 


AVEr RCC PGRDPAC FARGWRLDR V YGTC FCDQACRFTGDCCPDx 
DRACPARPCFVGEWSPWSGCADQCKPTTRVRRRSVQQEPQNGGA 
P CP PLE E RAG CLEYS TPQGQDCGHTYVPAF I TTSAFNKERTRQA 
TSPHWSTHT3DAGYCMEFKTESLTPHCALENRPLTRWMQYXREG 
YTVCVDCQPPAMN3VSLRCSGDGLDSDGNQTLHWQAIGNPRCQG 
TWKKVRRVDQCSCPAVHSFIFI 


6938 


3 


713 


NSRKLELAERVI7rDFMQLKKRRQSSEKBNDSG*rUDTVGA\nAa)ri" 
EGNVAAAVS SGGLALKH PGR VGQAALYG CGC WAENTGAHNPYS T 
AVSTSGCGEHLVRTI LAR E CS HALQAEDAHQ ALLETM QNKP I S S 
P FLASEDGVLGGVI VLRS CRCSAE P DS SQNKQTLLVE FLWSHTT 
BSMCVGYMSAQDGXAXTHISRLP PGAVAGQSVAIEGGVCRLGEP 
SELTLQAECEASQRHFRT 


6939 


3 


610 


KVTAPRRPQRYSSGHGSDNSSVLSGELPPAMGRTALFHHSGGSS ' 
GYES LRRDSEATGSAS SAPDSMSESGAAS PGARTRSLKSPKKRA 
TGL0RRRLIPAPIJDTTAU3RKPSLPGQWVDLPPPLAGSLKEPF 
E I KVYE I DDV E RLQR PR PT PREAPTQGLAC VS TRLR LAE RRQQR 
LRE VOAKH KHIiCEELAETQGRLMLE PGRWLEQFEVD PELE PES A 
EyiJ^ALERATAALEQCVNLCKAHVMMVTCFDISVAASAAIPGPQ 
BVDV 


6940 


1188 




G KMAAQ P LRHRS RCATP PRGDFCGGTERAI DQAS FTTS ME WDTQ 
VVKGSSPLGPAGLGAEEPAAGPQLPSWLQPERCAVFQCAQCHAV 
iiAUS Vii±i/iW LJjjbKSIjQAVVr 5R VTNNVV1jKAPFL»VGIEGSLjKGS 
TYNLLFCGSCGIPVGFHLYSTHAALAALRGHFCLSSDKMVCYLL 
KTKAIVNASEMDIQNVPLSEKIAELXEKIVLTHNRLKSLMKILS 
EVTPDQSKPEN 


6941 


1 


713


SLSRADSDPHGPHTamVLNVIIGSNVLAIiAEAQRQAEALGYQA 
VVLSAAMOXSDVKSMAQFYGIaLAHVARTRLTPSMAGASVEEDAQL 

RGGRNOELALRVGAELRRWPLGPI DVLFLSGGTDGQDGPTEAAG 
AWVT PELASQA A AEGLDIATF1J\HNDSHTPFCCLQGGAHLI*HTG 
MTGTNVMDTHIiLFLR PR 


6942 


1 


246 


GDYVERYDPKTDTWTMGAPLSMPTNAVGGCriliGDRLYADGGYDG 
O^LNTMESYBPQTNEWTQMASLNIGRAGACVVVIKQP 


6943 


1 


739


PMATG DGAKTLAI HVKALTADS IRITW KATLPAS S FRLS WLRLG 
HSPAGGS ITBTLVQGBKTEYIiLTALEPKPTYI I CMVTMETTNAY 
VADETPVCAKAETADSYGPTTTLNQEQNAGPMASLPLAGIIGGA 
VALVFLFLVLGAICWYVHQAGELLTRERAYNRGSRKKDDYMESG 
TKKDNS ILE IRGPGLQMLP I NPYRAXEEYWHTI FPSKGSSLCK 
ATHTIG YGTTRG YRDGG I PD I DYS YT 


6944 


960


156 


VAN I LLNGVKYES ELTGS S ERAEQPLS VGRLCS T I CNM PKAltRT 
LCVNHFLGWLSFEGMLLFYTDFMGEWFCX3DPKAPHTSEAY0KY 
NSGVTMGCWGMCI YAFSAAFYS A ILEKLEEFLS VRTLYFIAYLA 
FGWTGLATLSPJfLYVVLSLCITYGILFSTLCTLPYSLLCDYYQ 
SKKFAG SS ADGTRRGMGVDIS LLS CQ YFIAQ I LVSLVLG PLTSA 
VGSANGVMYFSSLVSFLGCLYSSLFVIYEIPPSDAADEEHRPLL 
LNV 


6945


2067


179


EGEDRGLPRTMGAAIiGTGTRIAPW PGRACGALP RWT PTAPAQGC ~
HSKPGPARPVPLKKRGYDVTRNPHLNKGMAFTLBERLQLGIHGL
IPPCFLSQDVQLLRlMRYYKRQQSDLDKYIILMrLQDRNEKLFY
RVLTS DVBKFMP I VYT PTVGLACQH YGLTFRRPRGLF I T I HDKG

x ™"» " r aUS* x r\K v v v X LAa art A IaJUJuUS C X GMG X P VGKXiA

LYTACGGVNPQQCLPVLIiDVGTNNEELLRDPLYXGLKHQRVHGK
AYDDIiLDBFMOAVTDKFG 1 NCL IQFEDFANANAFRLLNKYRNKY
CM FNDD I QGTAS VAVAG I LAALR I TKNKL3NHVFG FQ G AG EAAM
G \ I AHTjLVMALE \ KEGVPKA \ EATR K I W \MVDF \ K3GL I VQGRDH
LNHEKEMFAQD\HPEVNSLEEVVRLVKPTAIIGVAAIAEA\FTE
QILRDMASFHERP\IIFALSNPTSKAECTA\EKCYRVTEGPRGF
FAS \ GS PF *G VL I WEMGKTFI PGGRGNNA*R VPRGWQLGVHSPG
GDPGHI P \ DE I FliPDSRAKIiPQEVS EQHLSQGRL YP \ PLS T \ I R
NVFLRIA1 KVFD * G YKHNLV\S YY PE PKD\ KEAFCK I PG S YTPD
YDSFYT/VDSYIWAQGKAMNVQTV


6946


133


2551 


SraYSGITVAPGDPCPGVTtfibLAPSMA^ 

NLDGTLGYLIJ}102TLRIjHPDIFLPSE1\CDRLVNEYVELVNAAC 

NF\EPHE\SFFNPLFRDPRKQPASRRIHL\RED\LVQD\QD\LE 

AIRKQDL\VEL\YLTN\CEKLSAKSI^TLRSFSHTLGVP*AFFG 

c\tnilllrkenpggl/ CEDEYLFNPTCQVLVEDFTFEGFSRLR 

F\LKLGRMIDWVPVES\liRPI^SIAAIiDI£GIO/rSDAA\^ 

WKDSL\VSLVL\YW©LSDDHIR\VIVQLHK1JUILDISRDRL3S 

YYKFKLTREVLSLFVQKLGNI^SLDISG\HMILENCSISKIGKR 

EAGQTS I \EPSK\SSI IPFRGFEGGPLQF\LGVF*GI FCGRLTH 

I PAY KV5GDKNEEQV1jNAI E AYTEHR PE ITSRAINLLFD XARX b 

RCNQI^RALKLVITArjKCHKYDRNlQVTGSAALFYLTNSEYRSE 

QSVKIJIRQVIQVVLNGMESYQEVTVORNCCXTLCNFSIPEELEF 

QYRRVNELLLS I LNPTRQDBS IQRIAVHLCMALVCQVDNDHKEA 

VGKMGFVVTMIJCLTQKKLIJ)KTCDQVMEFSW\SALWN'ITDETPD 

NCEMFLNFNGNKLFIJDCI^EFPEKQELHRNMIiGLLGNVAEVKEL 

RPQLMTSQFI S V FSMjLES KADG IEVS YNACG VLSH I M FDGPEA 

WGVCEPQREEVEERMWAAIQSWDINSRRNXNYRSFEPILRIjLPQ 

GISPVSQHWATWALYNIiVSVYPDKYCPLLIKEGGMPLLRDI IKM 

ATARQETKEMARKVIEHCSNFKEENMDTSR 


6947


2 


1682 


TSVSTIPRGIiASARPQSRSWRCCPVWRRSPGRARGRGLKMLNVP 
SQSFPAPRSQOJIVASGGRSKVPIJCQGRSLMDMIRLTKSGKDLTG 
LKG RL I EVTEEELKKHNKKDD CWI CXRG FVYNVS P YME YHPGGE 
DELMRAAGSDGTELFDQVHRWVNYESMLKECLVGRMAX KPAVLK 
DYREEEKKVLNGMLPKSQVTDTliAKEGPSYPSYDWFQTDSIiVTX 
/EHlY*TEGYQFRIdWS*SSE*FLYSRNNY*GLLISYTYW/R*A 
MRFRKIFLCGL/CESVGKIEIVLQKKENTSVraFIXSKPLKNHNSL 
X PRKDTGL YYRXCQLXS KEDVTHDTRL FCLMLPPSTHLQ VP IGQ 
HVYLKLPITGTEIVKPYTPVSGSLLSEFKEPVLPNNKYIYFLIK 

IYPTGLFTPELiDRLOTGDPVQVff «3Dl?r!TJPirTCt7t7nt'T etit t-»t y 

AGTGFTPMVKILNYALTDI PSLRKVKLMFFNKTEBDI X WRS QLE 
KTiAPKDKRLDVBFVLSAPI SEWNGKOGH I S PALXiS E FLKRNLDK 
SKVLVCXCX5PVPFTEQGVRLLHDLNFSKNEIHSFTA 


6948 


104 


58 


PDGAHSFFPDE YFTCSSLCLS CGVG CKKS MNHG KEG VPHEAKSR 
CR YSHQ YDNRVYTCKACYERGEEVS WPKTS ASTOS PWMGLA KY 
AWSGYVI ECPNCGVVYRSRQYWFGNQDPVDTVVRTE I VHVWPGT 
DGFLKDNNNAAQRLLDGMNFMAQSVSELSLGPTKAVTSWLTDQI 
APAYWRPNSQXLS CNKCATSFKDNDTKHHCRACGEGFCDSCS S K 
TR P VPERGWGPAP VRVCDNCYEAR/ TRPVS CYRGTSGR * RRRRT 
QETVE 


6949


152 


46S6 


GLRLCLSRPLTRPGDDSVGGSAMASGAGGVGGGGGGKIRTRRCH " 
QGPIKPYQQGRQQHOGILSRVTESVKNIVPGWLQRYFKTKNEDVC 
SEQ
ID
NO:



Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence



Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence



Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



6950



"258T 



411 



! S CSTDT S B VPRWPBNKEDHLVYADEESSNITDGR I TPEPA VSNT ' 

BEPSTTSTAST\YPDVLTRVSLYRSHLNFSMLESPALHCQPSTS 
\ SAFP IGS SG FSLVKE 1 KDSTSQHDDDNIS TTSGPSSRAS D KD IT 
VSKNTSLPPLWSPEAERSHSLSQHTATSSKKPAFNLSAFGTLSP 
SLGNSS ILKTSQW3DS PF YPGKTTYGGAAAAVRQSKLRNTP YQA 
PVRRQMKAKQLSAQS YG VTS S TARRI LQSLEKMSS PLADAKRI P 
1 S I VS S PLNSPLDRSG ID ITD FQAKREKVDSQYP PVQRLMTPKPV 
S IATNRSVYPKPSLTPSGEFRKTNQRIDKKCSTGYEKNMTPGQN 
REQRESGFSYPNFSLPAAKGLSSGVGGGGGKMRRERHAFVASKP 
tiEEEEMEGPVLPKISLPITSSSLPTFNFSSPEITTSSPSPINSS 
[ QAI#TinCVQMTSPSSTGSPMPKPSSPIVKSTEANVLPPSSlGFTF 
SVP VAKTAHLSGSSSTLEP I ISSSAHHVTTVNSTNCKICTP PEDC 
! EGPFRPAEILKEGSVLDILKSPGFASPKIDSVAAQPTATSPWY 
| TRPAISSFSSSGIGFGESIJCAGSSWQCDTCLI^NKVTDNKCIAC 
QAAKLSPRDTAKQTGIETPNKSGKTTLSASGTGFGDKFKPVIGT 
1 WDCDTCLVQNKPEAI KCVACE^PKPG TCVKRALTLTvVs E SAET 
I MTASSSSCTVTTGTLGFGDKFKRPIGSWECSVCCVSNNAEDNKC 
I VSCMSEKPGSSVPTSSSSTVPVSLPSGGSLGLEKFKKPEGIWDC 
ELCLVQNKADSTKCLACESAKPGTKSGFKGFDTSSSSSNSAASS 
SFKFGVSSSSSGPSQTLTSTGNFKFGDQGGPKIGVSSDSGYINP 
MSEGP* FS KHI VGFKFG VS S ES KPEE VKKDSKNDN PKPGLS FGL 
SNPVFLTPFQFGVSNLGQEEKKEELLKSSCAGFRFGTGVINSTR 
j VP ANT I VTS ENKSS FNLGT I ETKS VS VAP LKCQTS EAKKEEM P A 
TKGG FS FGNVBPAS LPSAS VFVLGRTEE KQQE PVTSTSLVFGEG 
KI*TMKEPKC\QPVFSFGEFQRQTKDENSSKSTFSFSMTKPSEKE 
SEQPAKATFAFGAQTNTTADQGAAKPDLSYLNNSSSSSSTPATS 
AGGG\IPGSSTSSSNPPVATFVFGQSSNPGSSS\AFGMTAESST 
SQSLLFSQDSIOATTSSTGTAVTPFVFGPGASSNNTTTSGPGFG 
I ATTTSS SAGS SFVFGTGPSA PSAS PAFGANQTPTFGQSQG AS QP 
j NPPGFGS ISSSTALFPTGSQPAPPTFGTVSSSSQPPVFGQQPSQ 
S AFG SQTTPNS SS AFQFGS STTNFNFTNNS PSGVFTFGANS STP^ 

AASAQPSGSGGFPFNQSPAAFTVGSNGKNVFSSSGTSPSGRKIK 
TAVRRRK 

PRPG5 RSGLCRRAGE Rd3A V RAGGLSRRTRAK * I MDE iiHYQDTDS 

DVPE^RDSKCKVKMTHEEDEQLRALVRQPGQQDWKPIjASHPPNR 

TDQQCQYRWLRVIiNPDI,VKGPWTKEEDQKVlEIiVKKYGTKQWTL 

IAKHLKGRLGKQCRERWHNHLNPEVKKSCWTEEEDRI ICEAHKV 

LGNRWAEIAKMLPGRTDNAVKNHWNSTIKRKVDTGGFLSESKDC 

KPPVYLLLELEDECDGLQSAQPTEGQGSIiLTNWPSVPPTIKEEEN 

SEEELAAATTSKEQEPIGTDLDAVRTPEPLEEFPKREDQEGSPP 

ETSLPYKWVVEAAWLLI PAVGSSLSEALDL I ESDPDAWCDLSKF 

DLPEEPSAEDS INNSLVQLQASHQQQVLPPRQPSA\LVPSVTBY 

RLDGHTISDLSRSSRGELIPISPSTEVGGSGIGTPPSVLKRQRK 

RRVALSPVTENSTSLSFLDSCNSLTPKSTPVKTLPFSPSQFLNF 

WNKQDTLEI^PSLTSTPVCSQKVVVTTPLHRDKrPLHQKHAAF 

VT PDQKYS MDNTPHTPTP FKHALEKYGP LKPLPQTPHLEEDLKE 

VLRSEAGIELIIEDDIRPEKQKRKPGLRRSPIKKVRKSLALDIV 

DEDMKlMiSTLPKSLSLPTTAPSNSSSLTLSGrKEDNSLLNQGF 

LQAKP EKAAVAQKPRSHFTTPAPMS S AWKTVACGGTRDQL FMQE 

KARQLLGRLKPSHTSRTLILS 



239 



agpddtmkrslqalycqllsfi*liijAlteaiiafaiqepspresii ■ 

QVLPSGTPPGTMVTAPHSSTRHTSVVMLTPNPDGPPSQAAAPMA 
TPTPRAEGHPPT\TPSPPSLRQ* PPPIIiKAP/SSTGPAPAAMAT 

tsskpegrprgqaaptilltkppgatsrpttapprtttrrpprp 
pgssrkgagnssrpvppapgghsrskegqrgrnpsstplgqkrp 
lgkifqiykgwftgsvepepstltprtplwgyssspqpqtvaat 






WO 01/53312 



PCT/US00/34263





6952 






TVPSNTSWAPTTTSLGPAKDKPGLRRAAQGGGSTFTSQGGTPDA " 
TAASGAPVSP/PSCPSAFSAPPPR*PTGWPQP**LI i AYCYP\CT 
S RPLS TS SGVFTAATGPTPAAFDTS VSAPSQGI PQGASTTPQAP 
THPSRVSESTISGAKEETVA\PSP*PTGCPVLSPQWYPQPQAIS 
STAWSPPGPGSLGQQGTSPMWPRGTNRSTBPPSA*ARWISPG*S 
WPSACPSPP\LCPADGVLHEBEEEDRQPGEQPEAyGNNTHHPGT 
TFQQAC\RGAAPGE IP VPLKPLRTQLSEPRS PANGDYRDTGMVP 
C 


6953 


658 


304 


^BSEGESGKMTDRYTIHSQLEHU)SKyiGT\ATPTPPSGSG\CT^ 
PTP RLVTiT iT iHOP'LP P Q fYT iT oumu*DrtojipnT ^ r 

ASRQARGELRLCLTTAVRGTSPSVS PVCQSS


6954


1512


349


NWGKTRALASGKH VP FGKQTNPNKS / VHCDS * G* * RJRETTQDES 
PS PHPRGKMGGW\ KLEKBLENTEQ PVGGNEG * £HE VTGWLNSD 
PLLELCQCPLCQLDCGSREQLIAHVYQHTAAWSAKS YM\ CPVC 
GRALSSPGSLGRHLLIHSEDQRSNCAVCGARFTSHATFNSEKLP 
BVLNMESLPTVHNEGPSSAEGKDIAFSPPVYPAGIIiVCKrNCAA 

««»u*uunyx r «j V KAH/UIKKU NftfijlS VKI^RLERERTAKKSRRDN 

ETPEEREVRRMRDREAKRIiQRMQETDEQRARRLQRDREAMRLKR 
AIBTPEKRQARLIRERKAKRLKRRLEKMDMMLPAQF^SQDPSAMA 
ALAAEMNFFQLPVSGVELDSQLLGKMAFEEQNSSSLH 




819 


1 


PPPPFI IP3HPREAGT*AG * KRSGDSECS P PVBQ * A* TRAAAQN 
* PQR * RWTEGN S PQASAVATPGQGAS PAAPRCTP * PSRRHRRLP 
PGARP PAG* AAPAPTKPWLAG PAS A PQPGAAPLS P PAP PL I RTR 
♦CAGAAARGRPRRDRS PRPRTPGGCS WSEPRTPPAVSASAQTPS 
DAG* AGGR* GQRQRPSTGR * PPGVGGAGRSHRREGTT PGNPHPR 
^*^^QR*PGP/REWGI,+EPQGEBMSGPGGPGGAPPNQVGSS 


6955 
6956 


19*8 


782 



^PGRRQVRAQVAGAPVGHWGTRARQVKTGGRRRARRTMPFLGQD 
WRSPGWS WIKTEDGWKRCES CSQKLERENNHCNI S HS 1 1 LNS ED 
GEIFNNEEHEYASKKRKKDHFRNDTNTQSFYREKWIYVHKESTK 
ERHGYCTLGEAFNRLDFSSAIQDIRRFNYWKLLQLIAKSQLTS r 
I^GVAQKNYFNILDKXVQiCVLIJDHHNPRLIKDLLQDLSSTLCIL 
/N*RSREVCISGKHQYLDLPIRNYSRIATTATGSSDD*ASE\NG 
LTLSDLPLHMLNNILYRPSDGWDI ITLG QVTPTLY MIjS EDRQLW 

KKl^QYHFABKQFCRHLILSEKGHIEWroiYFALQKHYPAKBQY 
GDTLHFCRHCSILFWKDSGHPCTAADPDSCFTPVS PQHFIDLFK 




8605 


3839

2 
I 


QTSTS I FASPTiJ PP VLGE£* VLQDNSFDLNNGSnAEQEEMETQSS 
DFPPSLTQPAPDQSSTIQLHPATSPAVSPTTSPAVSLWSPAAS 
PEISPBVCPAASTWSPAVFSWSPASSAVLPAVSLEVPLTASV 
TS PKAS PVTS PAAAFPTAS PANKDVS S FLETTADVBE ITGEGLT 
ASG SGDVMRRRI ATPEEVRLP LQHGWRRE VRI KKGSHRWQGETW 
YYGPCGKRMKQFPEVIKyLSRKVVHSVRREHFSFSPRMPVGDKF 
EERDTPEGLQWVQLSAEEIPSRIQAITGKRGRPRNTEKARTKEV 
PKVKR^RGRPPKVKITELI^nxTDNRPLKKLEAQETLNEEDKAKI 
AKSKKKMRQKVQRGECCTCTIQGQARNKRXQETKSLKQKEAKKKS 
iUVKluSKGKTKOEKLKEKVKRfiJUCEKVKM^ 

KTIJ^TQRRLEERQRQQMILEEMKKPTEDMCLTDHQPLPDFSRVP 
SLTLPSGAFSDCLTIVEFLHSFGKVLGFDPAKDVPSLGVLQEGL 
LCQGDSLGEVQDLLVRLLKAALHDPGFPSYCQSLKILGEKVSEI 
PLTRDNVSE IIiRCFLMAYGVBPAL CDRLRTQPFQAQP PQQ KAA V 
CAFLVHBLNGSTLI INEIDKTLESMSS YRKNKWI VEGRLRRLKT 
/LAKRTGRSEVEMEGPEECLGRRRSSRIMEVTSGMEEEEEEESI 
UVPGRRGRRDGEVDATASSIPELERQIEKLSKRQLFFRKKIiLH 
3SQMLRAVSLGQDRyRRRYWVLPYLAGIFVEGTEGNI,VPEEVIK 
(ETDSLKVAAHASl^PALFSMKMELAGSNTTASSPARARGRPRK 
TKPGSMQPRHLKSPVRGQDS EQPQAQLQPEAQLHAPAQPQPOLQ 
IiQWJSHKGFLEQBGSPLSLGQSQHDLSQSAPLSWIfcSQTQSHSSL 
LSSSVLTPDSSPGKLDPAPSQPPEBPEPDEAESSPDPQALWFNI 
SAQMPCNAAPTPPPAVSBDQPTPSPQQLASSKPMNRPSAANPCS 
P VQFSS TPLAGLAPKRRAGDPGEM PQS PTGLGQPKRRGR PP S KF 
PKQMEQRYLTQLTAQPVPPEMCSGWWWIRDPEMLDAMLKALHPR 
GIREKALHKHLNKHPODPLQEVCLRPSADPIFEPRQLPAFQEGIM 
SWSPKEKTYETDLAVLQWVEBLEQRV1MSDLQIRGWTCPSPDST 
REDLAYCKHLSDSQEDITWRGRGREGLAPQRKTTNPLDLAVMRL 
AALEQNVERRYLREPLWPTHEVVLEKALLSTPNGAPEGTTTEIS 
YEITPRIRVWRQTLER^SAAQVCLCLGQLBRSIAWEKSVNKVT 
CLVCRKGDNDEPLLLCDGCDRGCHIYCKRPKMEAVPEGDWFCTV 
GLAQQVEGEPTQKPGPPKRGQKRKSGYSLNFSEGDGRRRRVLIjR 
GRESPAAGPRYSEEGLSPSKRRRLSMRNHHSDLTFCEIILMEME 

shdaawpflepvnprlvsgyrriiknpmdfstmrerllrggyts 
seefaadai>lvfdncqtfneddsevgkaghimrrffe\srweep 

YQG KQGQS VRQGRWGVTLWHLP PTFQTKTCHFH HiMLP WVQTQV 
RYNPDF 



3514 



HLIVAMPEJ^rKKEENEVPAPAPPPEE PSKEKEAGTTPAKDWTLV 
BTP PGEEQAKONANSQIiS I IiFIBKPQGGTVKVGEDI TFIAKVKA 
EDLSEKPTINGSRKWMDIiASKAGKHIiQLKETFERHSRVYTFEMQ 
I IKAKDNFAGNYRCEVTYKDKFDSCSFDLEVHES1X3TTPNIDIR 
SAFKRSGEGQEDAGELDFSGLUCRREVKQOEEEPOVDVWEIiLKN 
j TKPSEYBK1AFQYESPTCSGMLKRLKRSIRBBKKSAAPAKILDP 
| VYQVDKGGRVRFVVEIiADPKLEVKWNKKGQELRPSTKYI FEDTR 
CQSILNIDNCQMTDDSEYYVTAGDEKCSTELLVREPPIMVTKQL 
1 EDTTDYCGERVELECE VSEDDAQVKWFKNGEEI ILVQTRYRIRV 
EGKKHILI IEGATKADAADYSVMTTGGQSS AKIiSVDLKPLKIIiT 
PLTDQTVNLGKEICLKCEISENIPGKWTKNQLPVQESDRIiKVVH 
KQRIHKLVIDHALTEDEGDYVFAPDAYNVTLPAKVHVIDPPKI I 
LDGL0ADNTVTVI AGNKLRLE I P ISGE PP P KAM WS RGDKA I M EG 
SGRIRTESYPDSSTLVIDIAERDDSGVYHINLKNEAGEAHASIK 
VKWDFPDPPVAPTVTEVGDDWCIMNWEPPAYDGGSPILGYFIE 
RK KKQ S S RWMRLNFDL CKETTFE PKKM I EGVAYE VR I FAVNA\ I 
GISKPSMPSRPFVPIAVTSPPTLLTVDSVTDTTVTMRWRPPDHI 
GAAGI*DG YVLEYCFEGSTSAKQSDENGEAAYDLPAEDW IVANKD 
tilDKTKFTITGLPTDAKIFVRVKAVNAAGASEPKYYSQPILVKE 
IIEPPKTHSPKHLKQTYIRRVGDRVILVIPFOGKPRPELTWKKD 
GAEIDKNQINIRNSETDTIIFIRKAERSHSGKYDLQVKVDKFVE 
TASlDIRIIDRPGPPQrVKIEDVWGRNVALTWTPPiCDDGNAAlT 
GYT I Q KADKKSMEWLRVIEHI I EP VPHTELVIGNE YYFRVFSEN 
MCGLSEDATMTKESAVXARDGiCIYKNPVYEDFDFSEAPMFTQPL 
VNRLCHSGYMATLNCLSWG^PKPKITWMKNKVAIVDDPRYIIMFS 
NQGVCTLE IRKP SP YDGGTYCCKAVNDLGTVEI ECKLEVKV IAQ 



1663 



6959 



1469 



PRTSRVXTEGSO^SSAMDFSVKVDIEKEVTCPICLEIjLTEPLSL 
DCGH5FCQACITAKIKESVIISRGESSCPVCQTRFQPGNLRPNR 
HLANI VERVKEVKMS PQEGQKRDVCEHHGKKLQIFCKEDGKVIC 
^CELSQEHQGHQTFRINEVVKECQEKLQVALQRLI KENQEAEK 
LEDDIRQERTAWKNYIQIBRQKILKGFNEMRVILDNEEQRELQK 
LEEGEVNVLDNLAAATDQLVQQRQDASTLISDLQRRLRGSSVEM 
LQDV1DVMKRSES WTLKKP KS VS KKLKS VFR VPDLSGMLQ VTiKE 
LTDVQYYWVDVMLNPGSATSNVAISVDQRQVKTVRTCTFKNSNP 
CDFSAFGVFGCQYFSSGKYYWEVDVSGKIAWILGVHSKISSIJ^K 
RKSSGFAFDPSVNYS KVYSR YRPQYGYWVIGLQNTCEYNAFEDS 
SSSDPKVIiTLFMAV\LP WLGFS 

SLVHVVEFGRGIgDFpYLFFQLTHCO^RICSVTQAGVQWCgHSS 








LQPQTPGIiNQSS HI#S LLS5RDYRMLSS FNEW FWQDRF WLP PNVT 
vrrBIiEDRDGRVYPHPQDLLAALPLALVLLAMRIAFERFIGLPIiS 
RWLGVRIX}TRRQVKPNATLEKHFIiTEGHRPKBPQLSLIiAAQCGL 
TbQQTQRWFRRRRNQDRPQLTKKFCEASWRPLPYLSS FVGGLSV 
LYHESWLV7APVMCWDRYPNQLTLSCPAADSEA\SLYWWYLLELG 
F YLSLL I RLP FD VKRKGG GP S S 1 KPRPHYDP PSTA\DFKEQVTH 
H PVAVILMTFS YSANI»LRIGSL VLLLHDS SD YLLEACKM VNYMQ 
YQQVCDALFLI FSFVFFYTRLVLFPTQI LYTTYYES I SNRG PFF 
G YYFFNGLLMIiIiQLLHVFWS CL I LRMLYS FMKKGQME KD IRSDV 
EESDSSEEAAAAQEPLQLKNGTAGGPRPAPTDGPRSRVAGRLTN 
RHTTAT 


6960 


387 


2068 


AKWARE KEMQE F \ TRS F F \RGR PDLS TLTHS I VRRR YLAHSGRS 
HLEPEEKQALKRLVBEBPLKMOVDEAASRRDIOiDLTKKGKRPPT 
PCSD PERKRFRPNS E S ESGS EAS S PDYFGP PAKNGVAS RS HTHP 
KE BNPRRA\ S KAVEE S SDEERQRDLPAQRGEES S EEEEKG YKG K 
TRKKPVVKKQAPGKASVSRKQAREESEESEAEPVQRTAKICVBGN. 
KGTKSLKESEQESEEE I LAQ KKEQ R EE E V EE EE KE EDEEiCG DWK 
PRTRSNGRRKSAREERSCKQKSQAKRLLGDSDSBEEQXEAASSG 
DDSGRDREPPVQRKSEDRTQLKGGKRLSGSSEDEEDSGKGEPTA 
KGS RKMARLGSTSGEE S DLEREVS DS EAGGG PQGERKNRS S KKS 
SRKGRTRSSSSSSDGSPEAKGGKAGSGRRGEDHPAVMRLKRYIR 
ACGAHRNYKKLLG3CCSHKERLSILRAELEALGMKGTPSLGKCR 
ALKEQREEAAEVASLDVANI ISGSGRPRRRTAWNPLGEAAPPGE 
LYRRTLDSDE ERPRPAP PD WSHMRG X IS SDGESN 


6961 


340 


1646 



RPWSSPTMKPNFSLRLRIFNIiNCWGiPYLSKHRADRMRRLGDFL 
NQESFDLALLEEVWSEQDFQYLRQKLSPTYPAAHHFRSG1IGSG 
LCVFSKHPIQELTQHIYTLNGYPYMlHHGDWFSGKAVGLIiVLHL 
SG MVLNAYVTHLHAE YNRQ KD I YLAHRVAQAWELAQF I HHTS KK 
ADVVIiLCGDIiNMHPEDLGCCLLKEWTGIjHDAYLETRDFKGSEBG 
NTMVPKNCYVSQQELKPFPFGVRIDYVLYKAVSGFYISCKSFET 
TTGFDPHRGTPLSDHEALMATLFVRHSPPQQNPSSTHGP\AERS 
PL/MCVCLKEALDGSLGLGMA\QARWWA\TFA\SYVIGLGL\LL 
LALL C VLAAGGG AG EAA I L L WTP S VGLVLWAGAFYIiFHVQ EVNG 
LYRAQAELQHVLGRAREAQDLGPEPQLYALL\ LGQQEGDRTKEQ 


6962


340 


1646 


RPWSSPTWKPNFSLRLRIFNX^CWGIPYLSKHRADRMRRLGDFL 
NQES PDhKLLEEVWSEQDFQYLRQKLS PTYPAAHHFRSG I IGSG 
LCVFS KHP IQELTQHI YTLNG YP YM I HHGDWFSGKAVGIiL VLHL 
SGMVIdtAYVTHIiHAEYNRQKD I YLAHRVAQAVIELAQFIHHTS KK 
ADWLLCGDLNMHP E DIK3 CCLLiKEWTG LHDA YLETRDFKG S E EG 
NTMVPKNCYVSQQELKPFPFGVR ID YVLYKAVSGFYI SCKS FET 
TTGFDPHRGTPLSDHEALMATLFVRHS P PQQNPS STHGP \ AE RS 
PL/MCVCLKEALDGSLGLGMA\OARWWA\TFA\SYVIGLGL\LL 
LAI^CVIiAAGGGAGEAAILLWTPSVGLVLiWAGAFYLFHVQEVNG 
LYRAQAELQHVLGRAREAQDLiGPE PQL YALL\ LGQQEGDRTKEQ 


6963 


374 


2618 


RVTPLILKLLKKPKTAENQKASEENEITQPGGSSAKPGLPCLNF 

LHNFSITSVIiETIiNEQRNRGHFCDVTVRIHGSMLRAQRCVXAAGS 
PFFQDKLLLGYSDIEIPSWSVQSVQKLIDFMYSGVLRVSQSEA 
LQ I LTAAS II£ I RTVI DE CTRI VSQNVGDVF PG I QDSGQDT PRG 
TPESGTSGQSSDTESGYIiQSHPQHSVDRIYSALYACSMQNGSGE 
RSFYSGAWSHHETALGLPRDHHMEDPSWITRIHERSQOMERYL 
STTPETTHCRKQPRPVRIQTLVGNIHIKQEMEDDYDYYGQQRVQ 
ILERNESEECTEDt DQAEGTESEPKGES FDSGVS S S IGTE PDS V 
EQQFGPGAARDSQAEPTQPEQAAEAPAEGGPQTNQLETGASSPE 
RSNEVEMDSTVITVSNSSDKSVLQQPSVNTSIGQPLPSTQLYLR 
QTETLTSNLRMPLTLTSNTQVI GTAGNTYLPALFTTQPAGSGPK 








pflfslpqplagqqtqpvtvsqpglstftaqlpapqplas sagh 
stasgqgekkpyectlcnktftakqnyvkhmfvhtgekphqcsi 
cwrsfslkdylik\hmvthtgvrayqcsicnkrptqksslnvhm 

RLHRGEKSYBCYICKKKFSHKTLLERHVALHSASNGTPPAGTPP 
GARAGPPGWACTEGTTYVCSVCPAKFDQIEQFNDHMRMHVSDG 


6964 


1 


178


SGRPFFFPPSNTDVYPIKKVTNRWTAG3SYKMTRMKSIGKILLL 
QIFIG\NCSMFVLVI 


6965


757 


208 


NVF I E PR1QG FM KTSAHPG Q KHPDFS MGLLFPLLAALE VCS CGS 
SGSLG YNLPQNH\GLLGRNTLVLLGQMRR1SPFLCLKDRSDFRF 
PQEKVEVSQLQKA\QAMSFLYDVLQQVFNFSHKALL\CCMEHDL 
PGPTPHFTSSAAGTPGDLLGAGDGRRRSWGQWVIEGSTLALRRY 
FQESISTLE 


6966


820 


1867 


IITALGVP^MPGCPCPGCGMAGPPJLLFLTALALELLGRAGGSQP - ' 
ALRS RGT ATACRLDNKES BS WGALL S GERLDTW ICSLLGS LMVG 
LSGVFPLLVIPLEMGTMLRSEAGAWRX.KQLLSFALGGLLGNVFL 
HLLPEAMAYTCSASPGGEGQSLQQQQQLGLWVIAGILTFLALEK 
/HVPGQQGGGDQPGPQQRPHCCCRRAQWRPLSGPAGCRARPRCR 
GP \ D 1 KVSGYLNLIiANTIDNFTHGIiAVAASFLVS KKIGLLTTMA 
ILL1IE I PHEVGDFAILLRAGPDRWSAAKLQLSTALGGLLGAG FA 
ICTQS PKGVEETAAWVLPPT3GGFLYIALVNVLPDLLEEEDPW 


6967 


162 


633 


GFLPFKYWILDI^ASSRMBTDCNPMELSSMSGFEEGSELNGFEG 
TDMKDMRLEAEAVVNDVLFAVNNMFVSKSLRCADDVAYIFrVETK 
ERNRYCLELTEAGLKWGYAFBQVDDHLQTPYHETVYSLLDTL\ 
SPAYRRAFGKR\LLQRLEALKRDGQS 


6968


1 


2265 


RGGGGGRGGPGARERERPGEPBRTMEAAAGGRGCFQPHPGLQKT 
LEQFHLSSMSSLGGPAAFSARWAQEAYKKESAJCEAGAAAVPAPV 
PAATEPPPVIiHLPAIQPP P PVL PGP FFMPSDRSTERCETVIiEGE 
TI SCFWGGBICRLCIjPOI lns VLRD FS LQQINAVCDELH i ycsr 
CTADQIiBILKVMGILPFSAPSCGLITKTDABRLCNALLYGGAYP 
PP CKKEXAASLALGLELS ERS VRV YHE \C FGKCKGIj \ L VP ELYS 
SPSAACICX^LD\CRIiMYPPHKPVVHSHKALENRTCHWGF\DSA\ 
NWRAYI LLS QD YTGKEEQARLGR \ CLDDVKEKFD YGNKYKRRVP 
RVSSEPPASIRPKTDDTSSQSPAPSEKDKPSSWLRTLAGSSNKS 
LGCVHP RQRL SAFRP WS PAVSAS EKEL SPHLPAL I RDS FYS YKS 
FETAVAPNVALAPPAQQKVVSSPPCAAAVSRAPEPLATCTQPRK 
RKLTVDTPGAPETIiAP VAAPEBDKDSEAEVEVES REEFTS S LSS 
LSSPSFTSSSSAKDLGSPGARALPSAVPDAAAPADAPSGLEAEL 
EHLRQALEGGLDTKEAKEKFLHEVVKMRVKQEEKLSAAIiQAJ^ 
LHQELEFIiRVAKKEKLREATEAKRNLRKEIERJJlAENEKKMKEA 
NESRLRLKRELEQARQARVCDKGCEAGRLRAKYSAQIEDLQVKL 
QHAEADRECZiRADLLRfiREAREHItBK\VVK\ELQBQLWPRARPB 
AAGSEG\AAELEP


6969


1855 


118 


AGTMHGRLKVKTS E EQAEAKRLEREQKLKL YQS ATQAVFQKRQA 
GELD ES VLELT S Q I LGAN PDFATLWNCRREVLQQLETQKS PEEL 
AALVKAELGFLESCLRVNPICS YGTWHHRCWLLGRLP E PNWTREL 

NFSNYSSWHYRS CLLPQLHPQPDSGPQGRLPEDVLLKELELVQN 
AFFTDPNDQSAW FYHRWLLGRAD PQDALRCLHVS RDEACLTVSP 
SR PLLVGSRME ILLLMVDDS PLIVEWRTPDGRNRPSHVWLCDLP 
AASLNDQLPQHTFRVIVfTAGDVQKECVLLKGRQEGWCRDSTTDE 
QLFRCELS^KSTVLQSELESCKELQELEPENKWCL\LTIILLM 
RALDPLLYEKETLQYFQTI*K\AWDP3CRATY\LDDLRSKFLLENS 
VLKMEYAEVRVLHLAHKDLTVLCHLEQLIJjVTHLDLSHNRLRTL 
PPALAALRCLEDPPPRT\ VLQASDNAI ESLDGVTNLPRLQELLL 
CNNRLQQ PAVLQPLASCPRLVLLNLQGNPLCQAVG I LEQLAELL 
PSVSSVLT 





6970


3 


152B 


SFPPLLSSPSAVGEGKVAVAAPCPGRSECARAKMAYIQLEPLNE 
GFLSRISGLLLCRWTCRHCCQKCYESSCCQSSEDEVBILGPPPA 
QTPPWLMASRSSDKDGDSVHTASBVPLTPRTNSPDGRRSSSDTS 
KST YSLTRR I S SLESRRPSS PL I D I KP IE FGVLS AKKEP I QPS V 
LRRTYNPDDyFRKFEPHLYSIiDSNSDDVDSLTDEEILSKYQDGM 
LHF S TQ YDLLHNHLTVR VI EARDL P PP I S HDG 3 R QD MAH S NP YV 
KI CLL PDQKNS KQTGVKRKTQKP VFEERYTFE I P FLEAQRRTLL 
LTWD FDKFS RHCVIGKVS VPLCEVDLVKGGHWW KALI PSSQNE 
VEIK3ELLLSLNYLPSAGRLNVDVIRAKQLLQTDVSQGSDPFVKI 
QLVHGLKLVKTKKTSFLRGTTDP FYNESPSFKVPQEELENASLV 
FTVFGHKMKSSNDFIGRIVIG\QYSSGP\SEPNHWRRMLNTHRT 
AVEQ WHS LRSRAECDRVS PASLEVT 


6971


37 


3702 


ACFYVPGSRS FKLI PRHGLVNMGRSGKLPSGVSAKLKRWKKGHS " 
SOS NPAI CRHRQAARS RFFSRPSGBISDLTVDAVKLHNELQSGSL 
RLGKSEAPETPMEEEAELVLTEKSSGTFLSGLSDCmVTFSKVQ 
RFWESNSAAHKEICAVLAAVTEVIRSQGGKETETEYFAALIRKA 
AQHGVCSVLKGSEFMFEKAPAHHPAAISTAKFCIQEIEKSGGSK 
EATTTLHMLTLL KDLL PCF P EGL VKS CS ETLLRVMTLSHVLVTA 
CAMQAFKSLFHARPGLSTLSAELNAQIITALYDYVPSJENDLQPL 
LAWIJCVT^KAHINLVRLQWDLGIXSHLPRFFGTAVTCLLSPHSQV 
LTAATQSIiKEIliKECVAPHMADIGSVTSSASGPAQSVAKMFRAV 
EBGLTYKFHAAWSSVLQLLCVFFEACGRQAHP VMRKCLQS LCDL 
RLS PHFPHTAALDOAVGAAVTSMGPEVVIiOAVPIjE I DGSE ETLD 
FPRS WLLPVIRDHVQETRLG F FTTY FLPLANTLKS KAMDLAQAG 
STVE SKI YDTLQWQMWTLLPGPCTRPTDVAI S FKGLARTLGMAI 

serpdlrvtvcqalrtlitkgcqaeadraevsrfaknflp ilfn 
lygopvaagdtpaprravletirtyltitdtqlvnsllekasek 

VLDPASSDFTRI1SVLDLVVALAPCADEAAISKLYSTIRPYI1ESK 
AHGVQKKAYRVLEEVCAS PQGPGAL FVQSHLEDLIGCTLLDS LRS 
TSS PAKRPRLKCLEiHIVRKLSABHKEFITALIPEVILCTKEVSV 
GARKNAFAIiLVEMGHAFLRFGSNQBEALQGYLVL I YPGLVGAVT 
MVSCS IIiALTHLLFEFKGLMGTSTVEQLLENVCLLLASRTRDW 
KSAI^FIKVAVTVMDVAHIAKHVQLVMEAIGKLSDDMRRHFRMK 
LRNLFT\KFIPK\FGILTWGKiCAVGPKEYHRVLVNIRKAEARAK 
RHRALS QAAVEEEEEEEEEEEPAQGKGDS IBEILADSEDBEDNE 
EEERSRGKEQRKLARQRSRAWLKEGGGDEPLNFLDPKVAQRVLA 
TQPGPGRGRKKDHSFKVSADGRLIIREEADGNKMEEEEGAKGED 
EEMADPMEDVI IRNKKHQKLKHQKEAEEEELE I P PQYQAGGSGI 
HRPVAXKAMPGAEYKAKKAKGDVKKKGRPDPYAYI PLNRS KLNR 
RKKMKliQGQFKGLVKAAQRGSQVGHKNRRKDRRP 


6972 


2179 


973 


PGGAI LLPLWRRTRPREATVPRGAAQRGRARSAEGRI PSSQSPS " 
PAEAGGATRS PPPRPPRPARP PGPSAPPLLRSDAG PGATV5 AAA 
AAATERARRGATMGAQLSTLGHMVLFP VWFLYS LLMKLFQRS TP 
AITLES PDIKYPLRLIDRBI ISHDTRRFRFALPS PQHILGLP VG 
QHIYLSARIDGNLWRPYTPISSDDDKGFVDLVIKVYTKDTHPK 
FPAGGKMSQYLESMQIGDTIEFRGPSGLLVYQGKGKFAIRPDKK 
SNPIIRTVKSVGMIAGGTGITPMLQVIRAIMKDPDDHTVCHLLF 
ANQTE JCDI LLRPELEELRNXHSARFKLWYTkDRAP EA WDYGQG \ 
FVNEEMIRDHLPPPE\EEPLVLMCGPPPMIQYACLPNL\DHVGH 
PTERCFVF 


6973 


1 


1964 


LQPRCAHRGU^QKOSRPAPGVDAMVLCPVIGKIjLHKRVVIiASA 
S PRRQE ILSNAGLRFE WPSKFKEKLDKAS FATP YG YAMETAKQ 
KALEVANRLYQKDLRAPDWIGADTIVTVGGLILEKPVDKQDAY 
RMLSRFE/SGREHSVFTGVAIVHCSSKDHQLDTRVSEFYEETKV 
KFSELSEELLWEYVHSGEPMDKAGGYGIQALGGMIiVESVHGDFIi 
ITVVGFPLNHFCKQLVKIjYYPPRPEDIiRRSVKHDSIPAADTFBDli 











SDVEGGGS E PTQRDAGSRDE KAEAGEAGQATAEAECHRTRETLP 
PFPTRLLELIEGFMLSKGIjLTACKLKVFDLLKDEAPQKAADXAS 
KVDASACGMERLLDICAAMGLLEKTEQGYSNTETANVYLASDGE 
YS LHG PI MHNNDLTWNLPTYLE FAIREGTNQHHRALG KKAEDLF 
QDAY YQS P ETRLR FMIIAMHGMTKLTACQVATAFNLSRPS S ACD V 
GGCTGAIiARELAREYPRMQVTVFDLPDI I ELAAHFQPPGPQAVQ 
IHFAAGDFFRDPLPSAELYVLCRHiHDWPDDKVHKLLSRVAESC 
KPGAGLLLVETLLDEEKRVAQRAliMQSLNMLVQTEGKERSLGEY 
QCLIiELHGFHOVQWHIjGGVLDAI L\ PPKWPPEAQAACSIi 


6974 


30B2 


2172 


RSCAAFASFASRPPLEliFAPPGSHRSPPGRGrVATSAOCAi»SVRK 
L LAARPGLGTKYQ ATMVYKTLFALCILTAGWRVQSLP TSAPLSV 
SLPTNIVPPTTIWTSSPQNTDADTASPSNGTHNNSVLPVTASAP 
TSLLPKNIS IESREEEITSPGSNWEGTNTDPS PSGFSSTSGGVH 
LTTTLEEHS LGTPEAG VAATLSQSAAE P PTL I SPOAPAS SPSSL 
STSPPEWSASVTTNHSSTVTSTQPTGAPTAPESPTBESSSDHT 
PTSHATAEPVPQEKTPPTTVSGKVMCELIDMET\PPPFPG 


6975 


2 


500 


R PRPTVHCCKWALKLETAMETI/INVFHAHSGICEGDKYKLS KKEL 
KELLQTELSGFLDVKELML*ATEALKTFEEA* KSPI IQCSSSRS 
SLPPAPQPPPYL*LSAVPFPlHLPIiPLLPPQAQKDVDAVDKVMK 
BLDEHGDGEVDFQEYWIjVAALTVACNNFFWENS 


6976 


1216 


970 


GCQL* VAYGTTENS PVTFAHPPEDTVEQKAES VGR IMPHTEAR I 
MNMEAGTIiAKI^PGELCIRGYCLrML^YWGEPQKTEEAVDQDKW 
YMTGDVATMNEOjGFCK I VGRS KDMIIRGGENI YPAEIiEDFFHTH 
PKVQEVQWGVKDDRMGEEICACIRLKDGEETTVEEI KAFCKGK 
ISH FKI PKY I VFVTNY PLT I S G KI QKFKLREQMERH LNL * IKQQ 
ACPGRLA


6977


1298 


588 


SIiFINTNLbSNQIRKTSFGMCSEPISDNTEDQKGKLKTPDFA*R 
ANKKSKHHVNGNRTVEPFPEGTQMAVFGMGCFWGAERKFWVLKG 
VYS TQVG FAGGYTSNPTYKEVCSEKTGHAB WR WYQPEHMS FE 
ELLKVFWENHDPTQGMRQGNDHGTQYRSAI YPTSAKQMEAALS S 
KENYQKVLS EHG PGP I TTD IREGQTF YYAEDYHQQYLSKKPNG Y 
CGLGGTGVS CP VGIKK


6978


3 


242 


SFPFRDSRRCGCCKGSSLRHTAVAMVKLSKEAXQRIiQQLFKGSQ 
FAIRWGF I PLVI YLGFKRGADPGMPE PTVLSLLWG 


6979 


3917 


1146 


DEARVRGEAYAAAILSRCRHWSGPPPFPPSPPDRKGtiRGTEPWE 
AG PG S GAT PG ARAMD VRRLKVNELR E ELQRRG LDTRGLKTEIiAE 
RLQAALEAEEPDDERELDADDEPGRPGHINEEVETEGGSEIiEGT 
AQPPPPGLQPHAEPGGYSGPDGHYAMDNITRQNQFYDTQVIKQE 
NESGYERRPLEMEQQQAYRPEMKTEMKQGAPTS FLPPEASQ3UKP 
DRQQFQSRKRPYEENRGRGYPEHREDRRGRSPQPPAEEDEDDPD 
DTL VAI DTYNCDLH FKVARDRS S GYP LTI EG FA YL WS G ARAS YG 
VRRGRVCFEMKINEEISVKHLPSTEPDPHWRIGWSIiDSCSTQl, 
GEEPFSYGYGGTGKKSTNSRFENYGDKFAENDVIGCFADFECGN 
DVBhS PrKNGKWMGI AFR IQKEALGGQAL YPHVL VKNCAVE FNF 
GQRAE P YCS VLPG FTP I QHLPLS ER I RGTVGPKS KAECE I LMMV 
\3U¥i\i\SjKl JL WAX JUiAAiJN Pis KKYN IXjGTNAIMDKMRVMGLRRQR 
N YAGR WD VIiI QQATQCLNRLIQ IAARKKRNYILDQTNVYGSAQR 
RKMRP FEGFQRKAI VICPTDEDLKDRTI KRTDEEGKDVPDHAVL 
EMKAN FTLPDVGDFLDEVLF IELQREEADKLVRQYNE EGRKAGP 
PPEKRFDNRGGGGFRGRGGGGGFQRYENRGPPGGNRGGFQNRGG 
GS GGGGNY RGG FNRSGGGG YSQNRWGNNNRDNNNSNNRGS YNRA 
PQQQPPPQQPPPPQPPPQQPPPPPSYSPARNPPGASTYNKNSNI 
PGSSANTSTPTVS S YS P PQS FGFFPSTFQ PS YSQ P P YNQGG YSQ 
G YTAPPP PPPPPPAYNYGS YGGYNPAP YTPPPPPTAQTYPQ PS Y 
NQYQQYAQQWNQYYQNQGQWPPYYGNYDYGSYSGNTQGGTS TQ 


6980 


1 


420 


GTRGRKTGRVAAPSTRRRTGWMQKLQTRS PAMSLS DPGLGYHPT 








CWTLRWPPLCSLHALHVFHCLFSSRLGTPVSPRLAMDPNCSCEA 
GGSCACAGSCKCKKCKCTSCKKSCCSCCPLGCAKCAQGCI CKGA 
SEKCSCCA 


6981 


10 


1054 


PGRGPRRASLRPAFAARGVPQGGLGQAKQARTRACAALPTPHPS 
APRLLEPQGVFSLFPPPPGPWPNMILTKAQYDEIAQCLVSVPPT 
RQSLRKLKQRFP SQS QATLLS I PSQE YQKHIXRTHAKHHTSEAI 
E S YYQRYLNG WKNGAAPVLLDLANEOT YAPSLMARLI LERFLQ 
EHEETPPSKSI IKSMLRDPSQ I PDGVLANQVYQCIVNDCCYGPL 
VDCI KHAIGHEHBVLLRDLLLBKNLS PLDBDQLRAKG YDKTPDF 
ILQVPVAVEGHIIHWI ES KAS FGDE CSHHAYLHDQ FWS YWNRFG 
PGLVIYWYGFIQELDCNRERGILLKACFPTNIVTLCHSIA 


6982


153 


1285 


FPQQDCS APAAPGLAG SE PRRLRAYRRRRQRARGLKRVAWLAP P 
PSLLQGLQGWAQAPVDGTLGPEDSRASSPMIQNSRPSLLQPQDV 
GDTVETLMLHP VIKAFLCGS I SGTCSTLLFQPLDLLKTRLQTLQ 
PS DHGS RRVGMLAVLLKWRTBSIiLGLWKGMS PS I VRCVPGVG I 
YFGTLYSLKQ YFLRGHP PTALES VMLGVGSRS VAGVCMS P I TVI 
KTR YESGKYG YES I YAALRS I YHS EGHRGLFSGLTATLLRDAP F 
SGI Y1MFYNQTKNXVPHDQVDATL1 P ITNFSCGI FAG ILASLVT 
QPADVIKTHMQLYPLKFQWIGQAVTLIPKDYGLRGFFQGGIPRA 
LRRTLMAAMAWTVYEEMMAKMGLKS 




82 


773 


BMSFLQDPSFFTMGMWSIGAGALGAAALAIiLIiANTDVFbSKPQK 
AALEYLEDIDLKTLBKEPRTFKAKELWEKNGAVIMAVRRPGCFL 
CREEAADIiSSLKSMLDQLGVPLYAVVKEHIRTEVKDFQPYFKGB 
IFLDEKKKFYGPQRRKMMFMGFIRLGVWYNFFRAWNGGFSGNLE 
GEGFII^GWWGSGKO^ILLEHREKEFGDKVNLLSVLEAAKMI 
KPQTLASEKK 


6984 


184S 


1282 


GGRS AYSLPAGS LP RVPATAAAKMAS GVQVAD EVCRI FYDMKVR"" 
KCSTPEBI KKRKKAVIFCLSADKKCX IVEEGKE ILVGDVGVTI T 
D P F KHFVGMLP E KDCR YAL YD AS FETKESRKEELMFFLWAPELA 

PLKSKMIYASSKDAIKKKFQGIKHECQANGPEDLNRACIAEKLG 
GSLIVAFEGCPV - 




1887 


1324 


RRTAGX YP CF PKPGRTRHALCS WLLliLTGQLAFDD FQBS CAMM 
WQKYAGSRRSMPLGARILFHGVFYAGGFAIVYYLTQKFHSRALY 

yklaveqlqshpeaqealgpplnihylklidrenfvdivdaklk: 
ipvsgsksegliiyvhssrggpfqrwhiidbvflelkdgqqi pvfk 
lsgengdbvkkb 


6986 


642 


1350 


YHLY FKMGD PNSRKKQALNRIjRAQLRKKKE^liADQ FD FKM YI AF 

vfkekkkksalfevsevipvmtnnyeenilkgvrdssyslessl 
ellqkdwqlhapryqs mrrdvigctqbmdfilwprnd ieki vc 
llfsrwkesdbpfrpvqakfefhhgdybkqflhvlsrkdktgiv 

VNNPNQS VFLF I DRQHLQTP KNKATI FKLCS ICLYLPQEQLTHW 

avgtibdhlrpympe 


6987 


1623 


341 


leaabkasrafkesqrqtdsknyetejowspqksqrrydmyntac 
flge 1 evglytiqilqltpffhkenels kkhmvqflsgkwti p p 
dprnbcyij^skftshlkni^sdlkrcfdffidymvllkmrytq 
keiabimi^krvsrcfrkytelfchldpcllqskesqllqeenc 
rkklealradrfaglleylnpnykdattmbs ivne yafllqqn5 
kkpmtne kqns i lani ils cl kpns kliq plttlkkqijre vlq f 

VGLSHQYPGPYFLACLLFWPENQELDQDSKLIEKYVSSLNRSFR 
GQ YKRMCRS KQAS TLFYLGKRKGLNS I VK KAKIEQYFD KAQNTN 
SLWHSGDVWKKNEVKDLLRRLTGQAEGKLISVEYGTEEiaiaPV 
ISVYSGPLRSGRNI ERVSFYLGFS IEGPPGL 


6988 


3 


689 


TQLLRRPAVFVGSAASGIRSGLWSASSGHWCAPAAGRAHAPVPR 
LVRGLGAASTAAPQDAQTGPQ PM PRADC I MRHLPYFCRGQWRG 
FGRGS KQLG I PTANF PEQWDNLPAD I STG I YYGWAS VGSGDVH 
KMVVSIGWNPYYKNTKKSMBTHIMHTFKEDFYGEILNVAIVGYL 








aitciiui c uoiiaouj. oai UuiJliSBAiuvjuxtsijt'iSMiiiCiKEDNiFFQVS 
KSKIMNGH 


6989 


2 


1118 


LMP5DR PL5 P STHASAGSHCHAP PTTARRAF P 1 PFGS KSNMATL ' 
KDQIiIYNLLKBEQTPQNKITVVGVGAVGMACAISILMKDLADEL 

IITAGARQQEGESRLNLVQRNVNIFKPIIPNWKySPNCKLLIV 

snpvdiltyvawkisgfpknrvigsgcnldsarprylmgerlgv 

nruoutuH vjwjBiiQBa s V P VWS GMNV7AG VSLKTLHPDIiGTDKDK 
EQWKEVHKQVVESAYEVIKLKGrrSWAIGI^VADLAESIMKNLR 
R VHP VS TM I KGL YG I KDDVFLS VPCI LGQNGISDL VJCVTLTS EE 
EARLKKS ADTLWG I QXELQF 


6990 


71$ 


2£>8 


THASGMAS WLALRTRTAVTSLLS PTPATALAVRYAS KKSGGS S 
KNLGGKSSGRRQG IKXMEGHYVHAGNI IATQRHFRWHPGAHVGV 
GKNKCLYALEEGIVRYTKEVYVPHPRNTEAVDLITRLPKGAVLY 


6991 
6992


169 


4S1 


RRSSDFHNPGFLSRPVSLRENIHHQVICSTKNkRRNPX^SAVliL 
SS LLMTNLNPNES TENQP VDAYWAFTLDQE FLTYACVEGTGCL F 
CGRHVH 




944 


510 


RQAPGCSSLALRQVRQVYCGLVRAPQVQTRPLSSRFVERRGAIjY 

RSPMNQENP PPYPGPGPTAPYPP YPPQPMGPGPMGGP YPP PQGY 

PigGYPQYGWQGGPQEPPKTTVYVVEDQRRDELGPSTCLTACWT 
ALCCCCLWDMLT 


6993 
6994


1 


374 


QWCVTCPQHNARQGPAVPP6lOAYdAAPFl^LQ\^FTE^KcR^ 
DRVWIK^TVASLCPLWKGPQTVVLSPPTAVKVEGIPAWIHHSH 
VKPAARETWEARPS PDNPFRVTLKKTTSPAPVTPGS 




346 


1100 


QWPEKDPV^UVASSISSPWGKHVFKAILMVLVAI^IL^iHSAI^ 

RDFAPPGQQKREAPVDVLTQIGRSVRGTLDAWIGPETMHIiVSES 

SSQVI^AISSAISV7AFFAIiSGIAAQLLNALGIiAGDYIAQGLKLS 

PGQVQTFLLWGAGALVVYWLIiSLLUSLVIiALLGRILWGLKLVlF 

LAGFVALMRSVPDPSTRALIiLLALLILYALLSRI,TGSRASGAQL 

EAKVRGIiERQVEELRWRQRRAAKGARSVEEE T 


6995 


144 


1344 


GS VA VGLSG I MAAQKDLWDAI VIGAG I QG CFTAYHLAKHRKR IL 
LLEQFFLPHS RGSSHGQSRI IRKAYLEDF YTRMMHECYQ I WAQL 
EHEAGTQLHRQTGLLLLGMKENQELKTIQANLSRQRVEHQCLSS 
EEL KQRFPNI RLPRGEVGLLDNSGGVT YAYKALRALQDAr RQLG 
GIVRIK3EKWEINPGLIiVTVKTTSRSYQAKSLVITAGPWTNQLL 
RPLGIEMPLQTLRIXJVCYWREMVPGSYGVSQAFPCFLWLGLCPH 
HI YGLPTGE YPGLMKVS YHHGNHAD PEERDCPTARTD IGD VQI L 
SS FVRDHLPDLKPEPAVIESCMYTNTPDEQFILDRHPKYDNIVI 

GAGFSGHGFKLAPWGKILYELSMKLTPSYDZiAPFRISRFPSLG 
KABti 


6996 


543


1942 


MDLFTKYYSEWKGGRKNTNEFYKTIPRFYYRLPAENEVLLQKLR 
EBSRAVFLQRK5RELLDNEELQNLWFLLDKHQTPPMIGEEAMIN 
YEN P LKVGE KAG AKCKQF FTA KVF AKLLHTDS YGR I S I MQ F FNY 
VMRXVWLHQTR I GLS LYDVAGQG YLRES DLENY I LELI PTLPQL 
DGLEKSFYSFYVCTAVRKFFFFLDPLRTGKIKIQDILACS FLDD 
hbEhRDEELSKES QETNW FS APSALR VYGQYLNLDKDHNGMLS K 
E ELS RYGTATMlTIVFLDRVFQECLTYt)GEMDYKT YLDFVLALEN 
RKEPAALQ YI F KLLDIENKG YLNVFS LNYFFRAIQELMKI HGQD 
PVSFQDVKDEIFDMVKPKDPLKISLQDLINSNQGDTVTTILIDL 
NGFWTYENREALVANDSENSADLDDT 


6997 


370 


1104 


AMELTIFILRLAX YILTFPLYLLNFLGLWSWI CKKWFPYFliVRF " 
TVI YNBQMASKKRELFSNLQE FAGP S GKLS LLEVG CGTGAN FKF 
YPPGCRVTCIDPNPNFBKFLIKSIAENRHLiQFERFVVAAGENMH 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








QVADGSVDWVCTLVLCSVKNQERILRSVCRVLRPGGAFYFMEH 
VAAE CSTWNY FWQQ VLDPAWH LLFDGCNLTRE S WKALERAS FS K 
LKLQHIQAPLSWELVRPHIYGYAVK 


6998 


2 


616


FVSRALLRVRSRRHPAEERAAPGRPEDAPIBCPGATNCPEPtiWC 
SHLPVP YAPPTMESRGKSAS S PKPDTKVPQVTTEAKVPPAADGK 
APLTKPSKKEAPAEKQQPPAAPTTAPAKKTSAKADPALLNNHSN 
LKPAPTVPSSPDATPEPKGPGDGAEEDEAASGGPGGRGPWSCEN 
FNPIiLVAGGVAVAAIAIilLGVAFLVRKK 


6999 


14 


1591 


GRAGACSRRDTAMSIEIESSDVIRLIMQYLKENSLHRAIiATLQE 
ETTVSLKTVDSIESPVAClNSGHWDTVLQAIQSLKXiPDKTLIOL 
YEQVVLEIiIELREIjGAARSliljRQTDPMIMLKQTQPERYIHLENXt 
LARSYFDPREAYPDGSSKEKRRAAIAQALAGBVSWPPSRLMAL 
LGQALKWQQHQGLLPPGMTIDLFRGKAAVKDVEBEKFPTQLSRH 
IKFGQKSHVECARFSPDGQYLVTGSVDGFIEVMNFTTGKXRKDIj 
KYQAQDNFMMMDDAVLCMCFSRDTEMIiATGAQDGKIKVWKIQSG 
QCLRRFERAH S KG VTCLS FS KDS S QI LSAS FDQTI R IHGI* KSGK 
TLKE FRGHSS F VN EATFTQDGH Y I IS ASSDGTVKI WNMKTTECS 
NTFKSLGS TAGTDITVNSVILLPKNPBHFWCNRSMTVVTMNMQ 
GQI VRS FSSGKREGGDFVCCAIiSPRGEWI YCVGEDFVLYC FSTV 
TGKLE RTLTVHEKDVI G IAHHPHQNLIATYSEDGLLKLWKP 


7000 


2 


827 


GPGWFLELMESEGPPESERSEFFSQREEENEEEEAQEPEETGP 
KNPLLQPALTGDVEGLQKI FEDPENFHHEQAMQLLLEED I VGRN 
LLYAACMAGGSDVZRAIiAKYGVl^EiNEKTTRGYTLLHCAAAWGRL 
ETL KAL VELD VD I EALN FRE ERARD VAAR YSQTE C V E F L D WAD A 
RLTLKKYIAKVSLAVTiyrEKGSGKLIJCEBKNTILSACRAKNEWL 
ETHTEAS INELF^QRQQLEDI VTPIFTKMTTPCQVKSAKSVTSH 
DQKRSQDDTSN 


7001 


2056 


844 


RRCLIIAFLKGCFIFIYFlFIFETEFLSCCPGWSAVAQSRilAN 
FASQVQAI FILP KDSQVGPDVKSEAAPKRALYESVFGSGE I CGP 
TSPKRLCIRPSEPVDAVVWSVKHDPLPLLPEANGHRSTNSPTI 
VSPAiVSPTQDSRPNMSRPLITRSPASPLNNQGIPTPAQLTKSN 
APVHIDVGGHMYTSSLATLTKYPESRIGRLFDGTEPIVLDSLKQ 
HY FI DRDGQM FR Y I LNFLRTS KLL IPDDFKDYTLLYEE AKYFQIj 
QPMLLEMERWKQDRETGRFSRPCECLWRVAPDLGERITLSGDK 
SLXEEVFPEIGDVMCNSVNAGWNHDSTHVIRFPLNGYCHLNSVQ 
VLERLQQRGFEIVGSCGGGVDSSQFSEYVIiRRELRRTPRVPSVI 
RIKQEPLD 


7002 


1043 


498 


PMPSSTRWTTS * TYTDTS S AWACRPTTGTCT* TAAPGPTVRWWP 
TPCSRHQSRRRLTCWCSTSRPCGR*GGLCVRTAPTRPTTSASSS 
SWTSAGTSWPAGRRTGTATSGTATTTSVWPGCGTRMWSTQWSSV 
PRSRSCCSRPATTPPSKPGAPHAPCASSRHLAHGLAPSSPGLPA 
RGAEVC 


7003 


818 


61 


QGRFRAFCWQRDFLQPPGMRLSALLAIiASKVTLPPHYRYGMS PP 
GSVADKRKNPPWIRRRPVWBPISDEDWYLFCGDTVEILEGKDA 
GKQG KWQ VI RQRNWVWGGLNTH YR YI GKTMDYRGTMI PSEAP 

FPRAIX3IVPETWIDGPKDTSVEDALERTYVPCLKTLQEEVMEAM 
GIKETR\NTRRSIGIEPGABQLLPNFCPSLEG 


7004 


121 


2285 


FLLPVLTSRSLRQPAVPHARLGGVEPAAMKSARAKTPRKPTVKK 
G\ PKRTLKTQLG/ YYCRVRPLGFPDQECCIEVINNTTVQLHTPE 
GYRLNRNGDYKETQYS FKQVFGTHTTQKELFDWANPLVN0L IH 
GKNGLLFTYGVTGSGKTHTMTGSPGEGGLLPRCLDMIFNSIGSF 
QAKR Y VFKSNDRNSMDIQCEVDALLERQKREAMPNP KTSS S KRQ 
VDPEFADMITVQEFCKAEEVDEDSVYGVFVSYIEIYNNYIYDLL 
EEVPFDPINPNLHNLNCFVKIKiraNHYVAGCTEVEVKSTEEAFE 
VFWRGQKKRRIANTHLNRESSRSHSVFNIKLVQAPLDADGDNVL 






WO 01/53312 



PCT/US00/34263 











QEKEQITISQI^LVDIAGSERTNRTRAEGNRLREAGNINQSIOT 
LRTCMDVLRENQMYGTNKT^PYRDSKLTHLFKNYFTCEGKVRMI 
VCVNPKAEDYEENLOVl^FAEVTQEVEVARPVDKAI CGLTPGRR 
YRNQPRGP\IGNEPLVTDWLQSFPPLPSCEILDINDEQTLPRL 
I E ALEKRHNLRQMM I DE FNKQSNAFKALLQEFDNAVLS KENHMQ 
GKLNEKEKMISGQKIiEIERLEKKJIKTLEYKIEIliEKTTTIYBBD 
KRNLQQELETQNQKLQRQFSDKRRLEARLQGMVTETTMKWEKEC 
E RR VAAKQ L EMQNKL WVKDEKLKQL KAI VTE PKTE KPB R P SRE R 
DREKVTQRSVSPSPVPVSYL 


7005


63 


876 


RNMALYQRWRCLRLQGLQACRLHTAWSTPPRWLAbRLGLFEEI, 
W AAQ VKRLASMAQKEPRT I KI SLPGGQKI DAVAWNTTPYQLARQ 
I S STLADTAVAAQVNGEP YDLERPLETDSDLRFLTFJDS PEGKAV 
FWHSS THVLGAAABQPLGAVLCRGPSTB YGFYHDFFLG KERT I R 
GSELPVLERI CQELTAAARP FRRLBASRJDQLRQLFKDNPFKLHL 
IEEKVTGPTATVYGCGTLVDLCQGPHLRHTGOIGGLKLLSNSSS 
LWRSSG 


7006


22 


898 


NAFGRiiS TAVKMAAAA toLQWLP V I LLLLGAriP S PLS FFS AGP AT 
VAAADRS KWH I P I PS G KMYFS FGKZ LFRNTTI FLKFDG EP CDLS 
LNITWYLKSADCYNE IYNFKAEEVELYLEKLKEKRGLSGKYQTS 
SKLI^NCSELFKTQTFSGDFMHRLPLLGEKQEAKENGTNIiTFIG 
DKTAMHE P LQTWQDAP YI FI VHIG I SSSKESS KENS LSNLFTMT 
VEVKGPYEYLTLEDYPLMIFFMVMCIVYVLFGVLWLAWSACYWR 
DLLRIQFWIGAVI FLGMLEKAVFYAGFQ 


7007 


2 



1001 


AMTVSGPGTPEPRPATPGASSVEQLPJCEGNELFKCGDY'GGALAA 
YTQAIX5LDATPQDQAVLHRNRAAC!HLKLEDYDKAETEAS KAI E K 
DGGDVKAL YRRS QALE KLGRLDQAVLDLQRCVSLEPKNKVFQEA 
LRNIGGQIQEKVRYMSSTDAKVEQMFQILLDPEEKJGTEKKQKAS 
QNLWLAREDAGAEKI FRSNGVQ LLQRLIi DMGETDLM LAALRTL 
VGICSEHQSRTVATXSILGTRRWSILGVESQAVSIiAACHLLQV 
MFDALKEG VKKGFRGKEGAX IVGE WKQVWGLIiDVTVMEGMGLS Q 
PGQFFGDQTCSCRLFGIRFGDI ILL 


7008 


70 


1478 


CRSALGHERPPPAHLPAGGRRLQTCPRSCRWliGRPPSGLtPPGPR 
SPPPLAGPGQKMVQKKPAELQG FHRS FKGQNPFBLAFSIiDQPDH 
GDSDFGLQCSARPDMPASQPIDI PDAKKRGKKKKRGRATDSFSG 
RFEDVYQLQEDVLGEGAHARVQTCINLITSQEYAVKIIEKQPGH 
IRSRVFREVEMLYQCQGHRNVLELIEFFEEEDRFYLVFEKMRGG 
S IIjSHIHKRRHFNELEASWVQDVASALDFLHNKGIAHRDLKPE 
NILCEHPNQVS PVKICDFDLGSG I XLNGDCSPISTPELLTPCGS 
AEYMAPEWEAFSEEAS IYDKRCDLWSLGVILYILLSGYPPFVG 
RCGSDCGWDRGEACPACQNMLFES IQEGKYEFPDKDWAHISCAA 
KDLISKLLVRDAKQRLSAAQVLQHPWQGCAPENTLPTPMVLQR 
WDSHFLLP PHPCRIHVRPGGLVRTVTVNE 


7009


1 


626 


ARQLRNSWVDDFVAAPLI PLSQQI PTGNSLYES YYKQVD PAYTG 
RVGASBAALFLKKSGLSDI ILGKI WDLADPEGKGFLDKQGFYVA 
LRLVACAQSGHEVTLSNLNLSMPPPKFHDTSS PLMVTPPSAEAH 

GRVWDLSDIDKDGHLVRDEFAVAMHLVYRALE 


7010 


79 


571 


SHTRRAWPBTLLSPLCPLLGGGTAMSGGEQKPERYYVGVDVGT " 
GSVRAALVDQSGVLLAFADQPIKNWEPQFNHHEQ3SEDIWAACC 
WTKKWQG IDLNQ I RGLGFDATCSLWLDKQFHPLP VNQEGDS 
HRNVI M WLDHRAVS QVNR INETKHS VLQ YVGG 


7011 


3 


994 


riqtlpnqnqsqtqpllktppavlqpiapqttfgvqtqpqpqsl'" 

IiQAQISAASITPLLQTQPQPLLQQPQQKAGLLQPPVRIVSQPQP 
ARRLDPPSRFSGRNDRGDQVPNRICDDRSRERERERRRSRERSPQ 

rkrsrersprrerersprrvrrvvprytvqfskfsldcpscdmm 
elrrryqnlyipsdffdaqftwvdafplsrpfqlgnycnfyvmh 








REVESLEKNMAILDP P DADHL Y S AKVMLMAS PS M EDL YHKS CAL 
ABDPQELRDGFOHPARLVKFLVGMKGKDEAMAIGGHWSPSLDGP 
DPEKDPSVLIKT\AIRCCKALTG 


7012 


1 


2661


RRAGSVKRGEARLFGPTEROSEkPLRPSAARRPSMLSGKKAAAA 
AAAAAAAATGTEAG PGTAGGSENQS EVAAQ PAGLSG PAEVGPGA 
VGERTPRKKBPPRASPPGGIiAEPPGSAGPQAGPTVVPGSATPME 
TGIAETPEG\ RRTSRR KRAKVE YRBMDBSLANLSEDE YYSEE ER 
NAKAEKEKKLP PP P PQAPPEEENES E PEE PSG VEGAAFQSRL PH 
DRMTSQEAACFPDI I SGPQQTQKVFLFIRNRTLQLWLDNPKIQL 
T F E ATLQQ LK AP YNS DTVLVKR VH3 YLERHGIi INFO 1 YKR I KPL 
PTKKTGKVI I IGSGVSGIAAARQLQSFGMDVTLLEARDR VGGRV 
AT FR KGNYVADLGAMVVTGLGGN PMAVVfi KQVNMELAKI KQKCP 
LYEANGQAVPKEKDEMVEQEFNRLLEATSYLSHQLDFNVLNNKP 
VSLGQALEWIQLQEKHVKDEQIBHWKKIVKTQEELKELLNKMV 
NTL KEKI KBLHQQ YKKAS EVKPPRDI TAB FLVKS KHRDLTALCKE 
YD E LAETQG KLEE KLQBLEANP PSDVYLSSRDRQ I LDWHFANLB 
FANATPLSTLS LKHWDQDDDFE FTGS HLTVRNG YSCVP VALAEG 
LDI KLNTA VRQ VR YTASGCEVIAVNTRSTSQTFI Y KCDAVLCTL 
PLGVLKQQPPAVQFVPPLPEWKTSAVQRMGFGNLNKWLCFDRV 
PWDPSVNLFGHVGSrTASRGELFLFWNLYKAPILLALVAGEAAG 
IMENISDDVIVGRCIiAILKGIPGSSAVPQPKETWSRWRADPWA 
RGSYSYVAAGSSGNDYDIiMAQPITPGPSIPGAPQPIPRLFFAGE 
HT I RNYPATVHGALLSGLREAGR IADQFLGAMYTLPRQATPG V P 
AQQSPSM 


7013


1 



2661 


RRAGSVKRGKARLFGPTERQSERPLRPSAARRPEMLSGKKAAAA 
AAAAAAAATGTEAG PGTAGGSENG S EVAAQPAGLSG PAE VG PGA 
VGERTPRKKE PPRAS PPGGLAEPPGSAGPQAGPTWPGSATPME 
TG I AETPEG \RRTSRRKRAKVEYREMDESLANLSEDEYYSEEER 
NAKAE KEKKLP P P PPQAP PE EENES E PEEP SGVEGAAFQS RLPH 
DRMTSQEAACFPDIISGPQQTQKVFLFIRNRTLQLWLDNPKIQL 
TFEATLQQLEAPYNSDTVLVHRVHS YLERHGLI NFG I YKRI KPL 
PTKKTGKVI I IGSGVSGLAAARQLQSFC3MDVTLLEARDRVGGRV 
ATFRKGNWADLGAMVVTGLGGNPMAVVS KQVNMELAKI KQ KCP 
LYEANGQAVPKEKDEMVEQE FNRLLEATS YLSHQLDFNVLNNKP 
VS LGQALEWI QLQEKHVKDEQ IBHW KKI VKTQEELKELLNKM V 
NLKEKI KBLHQQYKEAS EVKPPRDITAEFLVKS KHRDLTALCKE 
YDELAETQGKLEEKLQELEANPPSDVYLSSRDRQILDWHFANLE 
F ANATPLSTLSLKHWDQDDOFEFTGSHLTVRNG YS CVPVALAEG 
liD I KLNTAVRQ VR YTA S G CE V I AVNTRS TSQTF I YKCDA VL CTL 
PLGVLKQQPPAVQFVPPLPEWKTSAVQRMGFGNLNKWLCFDRV 
FWDPSVNLFGHVGSTTASRGELFLFWNLYKAPILLALVAGEAAG 
I MEN I S DDVI VGRCLAI LKGI FGSS AVPQPKBTWSRWRADPWA 
RGSYSYVAAGSSGNDYDLMAQPITPGPSIPGAPQPIPRLFFAGE 
HTIRNYPATVHGALLSGLREAGRIADQFLGAKYTLPRQATPGVP 
AQQSPSM 


7014 


3 


3950 


DFEVGDKIRILATLEDGWLEGSLKGRTGIFPYRFVKLCPDTRVE 
ETMALPQEGSUUiIPErrSIJ)CI^NTLGVEEQRHETSDHEAEEPD 
CI ISEAPTSPLGHLTSEYDTDRNSYQDEDTAGGPPRSPGVEWEM 
PLATDS PTSDPTfi WNGISS QPQVP FHPNLQKS Q YYS TVGGSHP 
HSEQYPDLLPLEARTRDYASLPPKRMYSQLKTLQKPVLPLYRGS 
SVSASRWKPRQSSPQLHNLASYTKKHHTSSVYSISERLEMKPG 
PQAQGLVMEAATHS QGDGSTDLDS KLTQQL IEFEK5 LAGPGTE P 
DKILRHFS IMDFNS EKDI VRGSSKLITEQELPERRKALRPPPPR 
PCTPVSTSPHLLVDQNLKPAP PLWRPSRPAPLPPS AQQRTNAV 
S PKLLSRHRPTCE TLEKEGPGHMGRS LDQTS PCPLVL VRI EEME 
RDIjDMYSRAQEELNLMLEEKQDESSRAETLEDLKFCESNIESLN * 








MBLQQIiREMTIiLSSQSSSLVAPSGSVSAENPEQRMLEKRAKVIE 
ELLQTBRDYIRDLEMCIERIMVPMQQAQVPNIDFEGLFGNMQMV 
IKVSKQLLAALEISDAVGPVFLGHRDELBGTYKIYCQNHDEAIA 
LLEIYEKDEKIQKHLQDSLADLKSLYNEWGCTNYINLGSFLIKP 
VQRVMRYPLLLMELLNSTPESHPDKVPLTNAVLAVKE1NVNXNE 
YKRRKDLVLKYRKGDEDS LMEKISKLNIHS I IKKSNRVSSHLKH 
LTGFAPQIKDEVFEETBKNFRMQERLIKSPIRDLSLYLQHIRES 
ACVKWAAVSMWDVCMERGHRDLEQFERVHRYISDQIiFTNFKER 
TERLVISPLNQliLSMFTGPHKLVQKRFDKLLDFYNCTERAEKLK 
DKKTLEELQSARMJYBAL.NAQLLDELPKFHQYAQGLFTNCVHGY 
AEAHCDFVHQALEQLKPLLSLLKVAGREGNLIAIFHEEHSRVLQ 
QI^VFTFFPESLPATKKPFERKTIDRQSARKPLLGLPSYMLQSB 
ELRAS LLARYP PEKLFQAERNFNAAQDLDVSLLEGDLVG VI KKK 
DPMGS QNRWL I DNGVTKG FVYSS FLKP YNPRRSHSDAS VGS HSS 
TESEHGSSSPRFPRQNSGSTLTFNPN\S\MAVSFTSGSCQKQPQ 
DASPPPKEWDQGTLSASLNPSNSESSPSRCPSDPDSTSQPRSGD 
SADVARDVKQPTATPRS YRNFRHPE XVGYSVPGRNGQSQDLVXG 
CARTAQAPEDRSTE PDG S EAEGNQVYFAVYTF KARNPNELS VSA 
NQKLKILEFKDVTGNTE^LAEVNGiCKGYVPSNYIRKTEYT 


7015 


1842 


513 


RQAWHE\VAAPSWRGARLVQSVLRVWQVGPHVARERV1PFSSIjL 
GFQRR CVSC VAGSAPS G PRLASASR SNGQG S ALDHFI/3 F S Q PDS 
SVTPCVPAVSMNRDEQDVLLVHHPDMPENSRVLRVVLLGAPNAG 
KSTLSNQLI/3RKVTPVSRJCVHTTRCQAIjGVITEKETQVILLDTP 
GI 13 PGKQKRHHLELSLLEDPWKSMESADLVWLVDVSDKWTRN 
QLSPQLLRCLTICYSQIPSVLVMNKVDCLKQKSVLLELTAALTEG 
VVNGKKLKMRQAFHSHPGTHCPSPAVKDPNTQSVGNPQRIGWPH 
FKEI FMLSALSQEDVKTLKQYLLTQAQPGPWEYHSAVLTSQTPE 
EICANIIREKLLEHLPQBVPYNVQQKTAVWEEGPGGELVIQQKL 
LVPKESYVKLLIGPKGHVISQIAQEAGHDLMDI FLCDVDIRLSV 
KLLK 


7016


167


2513 


ILNAPKP^PPRDSVEAVAAKRDTGGGSWGTGMDVSGQETDWRST 
AFRQKLVSQIEDAMRKAGVAHSKSSKDMESHVFLKAKTRDEYLS 
L VARL I IH FRD IKNKKSQAS VSDPMNALQS LTGGPAAGAAG I GM 
PPRGPGQSLGGMGSLGAMGQPMSLSGQPPPGTSGMAPHSMAWS 
TATPQTQ LQLQQVAAAAAAATARSSS S SSRRRYS SSS SS SNS KQ 
FQAQQSAMQQ\QFQA\ WQQQQQL\QQ00QQQQHL IKLHHQNQQ 
QXQQQQQQU3RIAQLQLQCXJQQQQQQQQQQQQQALQAQPPIQQP 
PMQQPQPPPSQALPQQLQQMHHTQHHQPPPQPQQPPVAQNQPSQ 
LPPQSQTQPLVSOAQALPGQMLYTQPPLKFVRAPMVVQQPPVQP 
QVQQQQTAVQTAQAAQMVAPGVQVSQS S LPMLSS PSPGQQVQTP 
QSMPPPPQPSPQPGQPSSQPNSNVSSGPAPSPSSFLPSPSPQPF 
\QSPVTARTPQNFSVPSPGPLNTPVNPSSVMS PAGSSQAEEQQ Y 
LDKLKQLSKYIEPLRRMINKIDKNEDRKKDLSKMKSLLDILTDP 
SKRCPLKTLQKCEIALEKLKNDMAVPTPPPPPVPPTKQQYIiCQP 
LLDA VLANI RS P VFNHS LYRTFVPAMTAIHGPP2 TAP WCTRKR 
RLEDDERQS I PS VLQGEVARLDPKFLVNLD PSHCSNNGTVHL 1 C 
KLD DKDIi P SVP PLE LS VP ADY PAQS PLW I DRQWQYDANP FLQS V 
HRCMTSRLrliQLPDKHSVTALLNTWAQSVHOACLSAA 


7017 


1 


1785 


INLGNTCYMNSVI*AL^TDFRJ^Q\^SI^LNGCNSLMKKI^HL 
FAFLAHTQREAYAPRI FFEASRPPWFTPRSQQDCSEYLRFLLDR 
LH EEEKI UCVQASHKPS E I LE CS ETSLQE VASKAAVLTETPRTS 
DGEKTL I E KMFGGKLRTHTRCLNCRSTSQKAEAFTDLS LAFWPS 
YSLEYMSCPDCSQSPSIQDGGIiMQASVPGPSEEFVVYNPTTAAF 
ICDSLVNEKTIGSPPNEFYCSENTSVPNESNKILVNKDVPQKPG 
GETT PSVTDLLNYFLAPE ILTGDNQYYCBN CASLQNAE JCTMQ I T 
EEPEYLILTLLRFSYDQKYHVRRKILDNVSLPLVLELPVKRITS 








FSSLSESVJSVDWFTDLSENLWaaiKPSGTDEASCTKLVPYLLS " 
SWVHSGISSESGHYYSYARN1TSTDSSYQMYHQSEALALASSQ 
SHLLGRDS PSAVFEQDLENKEMSKEWPLPNDSRVTFTS FQS VQK 
ITSRFPKDTAYVLLYKKQHSTNGLSGNNPTSGLWINGDPPLQKE 
LMDA IT KDNKL YLQ E Q ELN ARARALQ AAS AS C S FRPNG PDDND P 
PGSCGPTGGGGGGGFNTVGRLVF 


7018 


464 


1066 


SLVFRGNTWSGEAGHHCSALFNLAAYHQLFVGTERIRAPEIIFQ 
PS I*IGEEQAGIAETLQYILDR YPKDVQBMLVQNVFIiTGGNTMYP 
GMKARMEKEU^mPFRSSFOVQLA^PVLnAWYGARDMALNHL 
DDNEVWITRKEYEEKGGEY^KEHCASNIYVPIRLPKQASRSSDA 
QASSKGSAAGGGGAGEQA 


7019


1048 


335 


APGGFtiVTMVFPAPSP PWMLGCCSHEVTAGPPTLCKDMS ALVAA 
RMRHIPLAPGSDWRDLPNIEVRLSDGTMARKLRYTHHDRKNGRS 
SSGALRGVCSCVEAGKACDPAARQFNTLIPWCLPHTGNRHNHWA 
GLYGRLEWDGFFSTTVTOTEPMGKQGRVLHPEQHRVVSVRBCAR 
SQGFPDTYRLFGNILDKHRQVGNAVPPPLAKAIGLE I KLCMLAK 
ARE S AS AK I KE EEAAKD 


7020 


1 


2154 


FADSKRKSVLLDKIKNIiQVALTSKQQSLETAMSFVARNTFKRVR 
NG FLMR KVAVF FSNT PTRASPQLREAVLKLS DAG I T P LFLTRQE 
DRQL I NALQ INNTAVGHALVLPAGRDLTDFLENVLTCHVC LD I C 
N I D PS CGFGS WRPSFRDRRAAGS DVDIDMAFI LDSAETTTLFQ F 
NEMKKYIAYLVRQLDMS PDPKAS QHFAR VAWQHAPS ESVDNAS 
MP PVK VE FSLTD YGS KEKLVDFLSRGMTQLQGTRALG S AI EYT I 
ENVFES APNPRDLKI WLMLTG EVP EQQLEE AQRVI LQAKCKGY 
FFWLGIGRKVNIKEVYTFASEPNDVFFKLVDKSTELNEE PLMR 
FGRLLPSFVSSENAFYLSPDIRKQCDWFQGDQPTKNLVKFGHKQ 
VNVPNNVTS S PTSNP VTTTKP VTTTKPVTTTTKPVTTTTKP VT I 
INQPSVKPAAAKPAPAKPVAAKPVATKTATVRPPVAVKPATAAK 
PVAAKPAAVRP PAAAAAKP VATKP EVPR PQAAKP AAT K PATTK P 
MVKMSREVQVFEITENSAKLHWERPEPPGPYFYDLTVTSAHDQS 
LVLKQNLTVTDRVIGGLIJVGC/ITHVAWCYLRSQVRATYHGSFS 
TKKSQPPPPQPARSASSSTINLMVSTEPLALTETDICKLPKDEG 
TCRDFILKWYYDPNTKS CARFWYGGOGGNENKFGSQKECEKVCA 
PVLAKPGVISVMGT 


7021 


2 


338 


VNAVS F FPNG YAFATGS DDATCRL FDLRADQ ELLLY S HDN 1 1 CG " 
ITS VAFSKSGRLLLAGYDDFNCNVWDTLKGDRAGVLAGH DNRVB 
CLGVTDDGMAVATGS WDS FiiRIWN 


7022 


2 


856 


VYIGS FWSHPLLI PDNRKLFEAEEQDLFRDI Q S L PRNAALRKLN 
DLI KRARLAKVHAYI ISSLKKEMPS VFGKDNKKJCELVNNLAEI Y 
GRIEREHQISPGDFPNLKRMQDQLQAQDFSKFQPLKSKLLEWD 
DMLAHDIAQLMVLVRQEE S QRP IQMVKGGAFEGTLHG PFGHG YG 
EGAGEG I DD AEW WARD KPMYDE I F YTLS P VDG K I TGANA KKEM 
VRSKLPNSVLGKI WKLADI DKDGMLDDDEFAIiANHLI KVKLEGH 
ELPNELPAHLLP PSKRKVAE 


7023 


2 


748 


AMVFGG VVPYVPQ YRDIRRTQNADGFSTYVCLVIjL VANILR I LF" " 

WFGRR FES PLLWOSATM 7 T.TM r,T.MT . VT.PTP \m\r kmitt m* DDnni? 
in. uru\r cwrwjn Sfo/vAiiAJu* nuufUiiuiL 1*5 VK VANoJLiNARRRSF 

TAADS KDEEVKVAPRRS FLDFDPHH FWQWS S FS DYVQ CVLAFTG 

VAGYI TYL3 IDSALFVETLGFLAVLTEAMLGVPQL YRNHRHQST 

EGMSI KMVLMWTSGDAFKTAYFLIjKGAPLQFSVCGLLQVLVDIA 

ilgqayafarhpqkpaphavhptgt KAJj 


7024 


1207 


190 


RTGVTGWAQVWMFGGGGVtSSGEQt^MPVKPERGLGPSDGWtV 
SSRRGSPGTVLGLPFWLLTPVLVSRSIRSMLLLTRSPTAWHRLS 
QLKPPVLPGTLGGQALHLRSWLLSRQGPAETGGQGQPQGPGIjRT 
RLLITGIiFGAGLGGAWLALRAEKERLQQQKRTEALRQAAVGQGD 
FHLLDHRGRARCKADFRGQWVLMYFGFTHCPDI CPDELEKLVQV 
VRQLEAEPGIjPPVQPVFITVDPERDDVEAMARYVQDFHPRLLGL 








TGSTKQVAQA8HSYRVYYNAGPKDEDQDYIVDHSIAIYLLNPDG 
LFTDYYGRSRS AEQI 3 DS VRRHMAAFRSVLS 


7025 


232 


832 


ERNS P IGNNENlt * K \lHSLL)CL L!K KGDWEGNTQFOTLQDNOBECF 
KQVI RTCE KR PTFNQHTVFNLHQR LNTGDKLNE FKE LGKAF I SG 
S DHTQHQL IHTS E KPCGDKECGNT FLPDSE VI Q YQTVHTVTCKT Y 
E CKE CGKS FSLRS S LTGHKRIHTGE KP FKCKDCGKAFRFH SQL S 
VHKRIHTGEKSYECKECGKAFSCG 


7026


328 


1146 


npnpsigdikdikkaaksmldpahkshfhpvtpslVflcfifixs 
lhqallsvgvs krsntvvgneneergtpyasrfkdmpnfi ale k 
ss vxirhccdllx g vaagss dri cts s lq vqrrf kammas i grls 
hge s adll i s cnaesaigwi ssrp wvgelm ftflfgd fes p lhk 
lrkss *lprkhr*qp inavrmfldqcmdgs ialraivsei pvfe 
ekknng* kg ige if* wgctlpphywoavttnvpklsnsgkllg 
qdeqphifg 




43 


954


GRRJbQQQQRPEDAEDGAEGGGKRGEAGWEGGYPEIVKENKXFElT" 
YYQELKXVP EGEWGQFMDAIjREPL P ATLR I TG YKSHAKE I LHC h 
KNKYFKELEDIiEMDGOKVEVPOPJUSWYPEBIAWHTNLSRKJLRK 
S PHLEKFHQ FLVS ETESGN I SRQEAVSM I PPLLLNVRPHH K I LD 
MCAAPGS KTTQLI EMLHADMNVPFPEGFVI ANDVDNKR CYLLVH 
QAKRLS S PC X M WNHDAS S I PRLQ I DVDGRKE 1LFYDR I LCDVP 
CSGDGTMRKN I D VWKKWTTLNSLQliHGIiQLRIATRGAEQL 


7028 


189 


608 


SRPPPEPEPGTMVEKGSDSSSEKGGVPGTPSTQSLGSKNKIRNS 
KKMQSWYSMLSPTYKQRNEDFRKLFSKIiPEAERLIVDYSCALQR 
ElLLQGRLYLSENWICFYSNIFRWETTISIQLKEVTCLKKEKTA 
KLIPNAIQ 


7029 


1343 


40 


VIiBSNTEAKQATGTSSKLRHGTGQEKGREGPRCPSGLAQIiRLWG 
/PCPHAGRETGPRA3APIPGS*GHGWHW*RKDGRGERSEGPSAIj 
SPHSPSLLNMQQAPTHVGPGMGSQRPRSSWPEQVGVGSQLSRB 
RWRA*RSLPGAAASERTEMTKERSP/RPCQGYDSSNWFTQPGKK 
TRKRNSRRNTMVSRGGGCLLYPLQSIMPE*QIiR*GAHASPPTQG 
R*GKGGPRSPLTKASGTTH I PTPFFGS I P/RPTRDSGPGTOTS \ 
AAPGQKRGHREA*QGPEPV/WGRVTTHLQGPAG*TKPLGS\RNW 
VPGPAEGEQGBGAGLEGRP * PLKGCRS TLTFS PQLSI PMVGKKP 
PEGTTASFFP\RSCHSE*RKPPPSCPHAPALSLPHPLPLPLPPIj 
PLPLPGAGT*HSARSGRPGQ9BTGSLCHNCHHCPPHCPKCSPGG 
T 


7030 


2 


521 


FVCFSAPGSGQGGKRRVNMELSAVGERVFAAEALLKRRIRKGRW 
EYLVKWKGWSQKYS rWEPEENI LQARLLAAFEERE REMELYGP K 
KRGPKPKTFLLKAQAKAKAKTYEFRSDSARG2RI PYPGRSPQDIj 
ASTSRAREGLRN \ RVCPRQRAAP APAAP \ PRRGPSGPGPRPG * G 
PGLHFPGPGGPSKHGFVPASEQHQHQOHLPRRGPSGPGPRPG 


7031


960 


59 


HCSVPGAEWPRKP PAQICPQIiTSRPHLSSPRSI*S PGCGHS PGPG 
/CKPS /RHCDELHEGPSRTAALPCGKPQPKHGVEECG/ PCP CliA 
PRRLTEPPALTVSPVGRAAPSGAL*PSGRACSACSHRLAPEAAI» 
SAAAPRPSLGSGQNASGLPAASLPPQDSSQPHKTVPSPARSVPP 
LGAQARAAP PRLW C PRALVS G * EAS PEAVS VAAGPPVPGPT PS T 
SGSTASHSRRGC* S PR * TPAP PRRDHGRS AAFEVLTAAASAQP C 
ASQGGPRPTGAGRTPSPLGLPPSRGPPAASARPFCRHPSL 


7032 


13*3 


2104 


RRPGRTEPVEPPPVPPPPRASNSKSRCR*RNLHIiAPL*QSPLRK 
SRQIGTSSLPFGRSAGERPRPAATFCLSRGGSSPVFL+PSSSSL 
EPWMKRQFGRLHS LPSTKS WQKMNS FLLTP KLDTSLMSGWR YRQR 
LPRLHTFLK3CSLQMASELAPPLPTPAPLASSLPPPPGPPPLLPV 
PLA*LSRSGILVPPNSGFSLSC\PLGDH+GSSGEVRGSCGSPPP 
HHCWVLPPPP*LLLPPR 


7033 


*89 " 


815 


RSRDCLSSSATSNRARRSKCSGPKRATPLDSGPGP*APPGPSSA " 








LMMPSSCPWRTGAIjGPSPAGSRALGRCTSSVGPGSRWLTRTSSP 
GCATRTWRTMRMEPRPLRSRMGESAPGIPAELPSAAPSGPSAPS 
AAAPSAPTTPAAAGPNTL*SRRTAEWCWPPSCSCCWGWC *SWSA 
WDWRRPPLQVSPAPSSSCRASCCWCLESIT*SSSTARSRATGAS 
SSSTCPTSRSDRGAAWTP\SPMGAPLLPCSVPLISREEALQDPR 
NPSP*GVCSGSSGHAGIALGKPPVACSVP 


7034 


92 


1942 


EDTSSMPFRLLI PLGMiCALLPQHKGAPGPDGSAPDPAHYRERV 
KAMFYHAYDS YLENAFP PDELRPLTCDGHDTWGS FSLTL IDALD 
TLL\TLFYFQILGNVSEFQRWEVLQDSVDFDIDVNASVFETNI 
RWGGLLSAKLLSKKAGVEVEAGWPCSGPLLRMAEEAARKLLPA 
FQTPTGMP YGTVNLLHGVNPGETPVTCTAG IGT FI VE FATLS SIi 
TGDP VFEDVAR VALMRLWESRSD 1 GLVGNHID VLTGKWVAQDAG 
IGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTRFDDW 
YLWVQM YKGTVSMPVFQS LEAYWPG LQS LIGD IDNAMRTFLNY Y 
TVWKOFGGLPEFYNI PQGYWEKREGYPLRPEL I ES AMYLYRAT 
GD PTLLELG RDAVES IB KI S KVECG FAT I KDLRDHKLDNRMES F 
FLAETVKYL YLLFDPTNF I HNNGST FDAVITP YGECI LGAGG YI 
FNTEAHPIDPAALHCCQRLKEEQWEVEDLMRE FYS L KRSRS KFQ 
KNTVSSGPWBPPARPGTL PS PENHDQARERKPAKQKVPLLS CPS 
QP FTS KLALLGQVFLDS S * PLDNFF I FI FLRLNYNKLLLAI I KK 
K 


7035 


92 


1942 


BDTSSMPFRIiLIPI^LLCALLPQHHGAPGPDGSAPDPAHYRBRV 
KAMFYHAYDS YLENAF P FD ELRPLTCDGHDTWGS FS LTL I DALD 
TLL\ TLF YFQ ILGNVS E FQRWEVLQDS VDFDIDVNASVFE TNI 
RWGGLLSAHLLSKKAGVEVEAGWPCSGPLLRMAEEAARKLLPA 
FQTPTGMPYGTVOTiLHGVNPGErTPVTCTAGIGTFIVEFATliSSL 
TGDPVFEDVARVALMRIiWESRSDIGIiVGNHIDVLTGKHVAQDAG 
IGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTRFDDW. 
YLWVQM YKGTVSM PVFQSLEAYWFGLQSL IGD IDNAMRTFTiN Y Y 
TVWKQFGGLPEFYNI PQGYTVEKREGYPLRPE LIES AMYLYRAT 
GDP TLLELGRDAVES X E KI S KVECGFAT I KDLRDHKLDNRMES F 
FIAF/TVKYLYLLFDPTMFIHNNGSTFDAVirPYGECILGAGGYI 
FNTEAHPIDPAALHCCQRLKEEQWEVEDLMREFYSLKRSRSKFQ 
KNTVSSGPWEPPARPGTLFSPENHDQARERKPAKQKVPLLSCPS 
QPFTS KLALLGQVFLDS S * PLDNFFI F I FLRLNYNKLLLAI I KK 
K 


7036


442 


761 


CLAPLFSCFQI INLHLAPSGRLRWAWLRGPGRN* LPGEGP£> t P¥ 
RNW* ERKAGCSQPC/ PAQQHHGRPPGVS PLPRDPHPTTLRPLPP 
PPPPPPPPPRRPPRNRRPG 


7037 


442 


761 


CLAPLFS CFQ 1 1 NLKLAPSGR LRWAWLRG PGRN * LPG EGP S I PT 
RNW* ERKAGCSQPC / PAQQHHGRP PGVS PLPRDPHPTTLRPLP P 
PPPPPPPPPRRPPRNRRPG 


7038 


155


891 


GAGAASDMSSGLRAADFPRWKRHI SEOLRRRDRLQRQAFEE iHT" 
Q YNKLLEKS DLH S VLAQ KLQAE KHDVPNRHE I S PGHDGTWNDNQ 
LQEMAQLRIKHQEELTELHKKRGELAQ\RVIDLNNQMQRKDREM 
QMNEAKIAECLQTISDLETECLDLRTKLCDLERANQTLKDEYDA 
LQITFTALEGKLRKTTEENQELVTRWMAEKAQEANRLNARE*KR 
LQEAASPAAERACRS SKGTSTSRTG 


7039 


155 


891 


GAGAASDMSSGLRAADFPRWKRHI SEQLRRRDRLQRQAFEE I I L 
QYNKLLE KS DLHSVLAQ KLQAE KHDVPNRHE I S PGHDGTWNDNQ 
LQEMAQLRI KHQE ELTELHKKRGELAQ \ RV I DLNNQMQRKDREM 
QMNEAKIAE CLQT I S DLETECLDLRTKLCDLERANQTLKDE YDA 
LQI TFTALEGKLRKTTEENQELVTRWMAE KAQEANRLNARE * KR 
LQEAASPAAERACRSSKGTSTSRTG 


7040 


34 


789 


KITPPRRPHRCSSGHGSDNSSVLSGELPPAMGKTALFYHSGGSS 
G YESVMRDSEATGSAS S AQDSTS ENSSS VGGRCRSLKTP KKRSN 








PGSQRRRLIPALSLDTSSPVRKPPNSTGVRWVDGPLRSSPRGEG' 
E P FE1 KVYE IDDVERLQRRRGGAS KEAMCFNAKLK1 LEHRQQR I 
AJE^mAiCYEWLMKELEATKQyLMLDPNKWLSEPDLEQVWELDSLE 
YLEALECVTERLESRVNPCKAHIiMMITCPDIT 


7041 


1


5*7 


S GRVAMGRRRAPAGGSLGRALMRHQTQRSRSHRHTDS HLHTSEL 
NDG YDWGRLNLQS VTEQS S LDD PLATAELAGTE FVAE KLN I KFV 
PAEARTGLLS FEES QR I KKLHEENKQFLC I P RRPNWNQNTTPE E 
LKQAEKDNFLEWRRQL\VRLEEEQKLILTPFERNLDFWRQLWRV 
IERSDIWQIVDA 


7042 


7 


345 


P I HMAAAALRADI \ iSPLFPH iQGYLLLSASHG \ATSLHTKGAL 
PL ETTVTMYTVIPKS KYVLVKPDTQYPYSBNLDE FKRLAENSAS N 
DDLLMAEVAISDYGDKLTLELREKY 


7043 


2 


2170 


ARGMAARDSDSEEDLVSYGTGIiEPLEEGERPKKPIPLQDQTVRD 
E KGR Y KR FHGAFS GG FS AG YFNTVGS KEG WT P S TFVS S RQNRAD 
KSVLGPEDFMDEEDLSEFGIAPKAIVTTDDFASKTKDRIREKAR 
QLAAATAP I PGATLLDDLITPAKLS VGFBLLRKMGWKEGQG VGP 
RVKRRPRRQKPDPGVKtYGCALPPGSSEGSEGEDDDYLPDNVTF 
APKDVTPVDFTPKDNVHGLAYKGLDPHQALFGTSGBH FNLFSGG 
S ERAGDLGRI GLNKGRKLG ISGQAFGVGALEEEDDD I YATETLS 
KYDTVLKDEE PGDQIjYGWTAPRQ YKNQ KE S E KDLRYVG KI ItDG F 
SIA3KPLSSKKIYPPPELPRDYRPVHYFRPMVAATSENSHLLQV 

LSESAG KATPDPGTHS khqlnaskraellgetpiqgsatsvlef 
lsqxdkbrikemkqatdlkaaqlkarslaqnaqssraqps paaa 
aghcs wnmalgggtatlkasnfkp fakdpe kqkr ydefl vhmkq 
gqkdalercldpsmtewergrbrdefaraallyasshstlssrf 
thakeeddsdqvevprdqendvgdkqsavkmkmfgkltrdrfew 
hpdkllfq /rlvglpr vkrd kysvfnfltl p etas l pttqas s e 

kvsqhrgpdksrkpsrwdtskhekkedsiseflrlarskaeppk 
qqssplvnkeeehapelsan 


7044 
7045


276 


734 



evyltdefakgrkvadlyelvqyagniiprlVLLitvgwyvks 

FPQSRKD1LKDLVEMCRGVQHPLRG2.FLRNYLLQCTRNILPDEG 
EPTDEBTTGDISDSMDFVLLNFAEMNKLWVRMQHQGHSRDREKR 
ERERQELRILVGTNLVRLSQV 


1 uts 




513 


I^FKMEALSRAGQEMSLAALKQHDPYITSlADLTGQVALYTFCP 
KANQWEKTDIEGTLFVYRRSASPYHGFTrVIiRLNMHNLVEPVNK 
DLEFQLHEPFLLYRNASLSIYSIWFYDKNDCHRIAKLMADVVEE 

etrrsqqa/rsgotesqpgqwlqrpqahrhpgdaeqsqg 


7046 


3 


513 


lgfkmealsragqemsi^aalkqhdpyitsiadltgqvalytfcp 
KANQWEKTDIEGTTiFVYRRSAS pyhgftivnrlnmhnlvepvnk 
DL B FQIiHEP F LL YRNASLS I YS I WFYDKNDCHRI AKLMADWEB 

etrrsqqa/rsgqtesqpgowlqrpqahrhpgdaeqsqg 


7047 


103 


486 


qmkiekogwsegltsikgnchnfytaiskdvtykelknllnskn 
imlidvreiweileyqki pesinvpldevgealqmnprdfkeky 

NEVKPS KSDS / 1 VFS YLAGVRS KKALDTAISLGFHS YYER 


7048 


92 


627 


FFCLTLLSSWDYRHHATRRVISSPVFTMEDSGKTFSSEEEEANY 
WKDLAMT YKQRAENTQEELREFQEGS RE YEAELE TQLQQI ETRN 
RDIiLS ENNRLRMELETI KE KFE VQHS EGYRQI S ALEDDLAQTKA 

IKDQLQKYIRBLEQANDDLERAKRATDHGLSKTFE\QRLN\QAI 
EKKW 


7049 


393 


938 


KRTGS AS YGCiP p P G JLGGPATXAS VAGRCSS VG K t PARRCYEDEL 
VPVFBAVGRIYELRLMMDFDGKNRGYAFVMYCHKHEAKRAVREL 
NNYE 1 RPGRLLGVCCSVDNCRLFIGG I PKMKKREEI LEEI AXVT 
EGVLDVIVYASAADKMKNRGLRIiRGVRE P PRGCH WLGRKL IAWX 
ASSLWG 


7050 


393 


938 


KRTGSASYGGPPPGLGGPATXASVAGRCSSVGKIPARRCYEDEL 








VPVPEAVGRXYEIJUiMMDFDGKNRC^A^ycWKHEAXRAVRBL 
i axnj-uiUJiAjVCCS VDNCRLFIGG I PKMKKREEI LEE IAKVT 

EGVLDVIVYASAADKMKNRGLRLRGVRBPPRGCHWLGRKLIAWX 
ASSLWG 


7051 


119 


816 


KKMNLAE I CUNAKKGRE YALLGNYDS3MVY YQGVMQQI QRHCQS 
VR D P AI KG KWQQ VRQE LLEE YEQVKS I VGTLES FKIDKP P DP PV 
SCQDEPPRDPAVWPPPVPAEHRAPPQIRR/RQSRSKTSEERNGR 
SRS PGTCRPST\ PISKSEKPSTSRDKDYRARGRDDKGRKNMQDG 
ASDGEMPKFDGAG YDKDLVEALERDI VSRNPS IHWDDI ADLEEA 
KKLLREAGVLPMWM 


7052
7053


467 


715


SCPGRGKMSKbJLNPEEMTSRDYYPDSYAHFGlHEEMLKDEVRTL 
1 liCN^^YHr^KHWKDKvVLDVGSGTGILSI^AARQGPRR 




467


715 


SCPGRGKMSKIiLNPBEMTSRDYYFDSYAHFGlHEEMLKDEVR"^^ 
TYRNS MYHN KHVFKDKWLDVG S GTG I LS M FAARQO PRR 


7054 
7055 


1 


1036 


GTSQRSRETDARRRSAGAEPTARLPWPAALEEWPSCPCEPLGPG 
RRCRWDAMEYDEKLARPRQAHLNPFNKQSGPRQHEQGPGEEVPD 
VTP E EALPELPPGEPEFRCP ERVMDLGLS EDHFSRPVGL FLASD 
VQQL RQ AXEE CKQVILELPEQS EKQKDAWRLI HLRLKLQELKD 
PWEDEPNIRVLLEHRFYKEKSKSVKQTCD KCNTI I WGL I QTWYT 
CTGCYYRCHSKCLNLISKPCVSSKVSHQAEYELNICPETGLDSQ 
DYRCAECRAP I / CS/DGWPSEARQCDYTGOYYCSHCHWWDLAV 
IPARWHNWDFEPRKVSRCSMRYLALMVSRPVLRLREIN 


7056


2 


527 


DSRK^^WRSWIJVNE/WGKKJJCLFIWLSMNVLLFWKTFLLYNQGP"^ 
EYHYLHQMLG/ALCLSRASASVLNLNCSLILLPMCRTLIAYLRG 
S QKVPSRRTRRLLDKSRTFH ITCGAT I CI FSGVHVAAHL VNALN 
FSVNYSEDFVELNAARYRDEDPRKLLFTTVPGLTGVCMEVVLFL 




2 


527 


DSRRVSWRSWLANE/WGKHLOJFIW^^ 

EYHYLHQMLG/ ALCLSRASAS VLNLNCSLILLPMCRTLLAYLRG 

SQKVPSRRTRRLLDKSRTPHlTCXSATICIFSGVHVAAHLVNAliN 

FSVNYSEDFVELNAARYRDEDPRKLLFTTVPGLTGVCMEVVLFL


7057
7058


1368 


431 



GtYLHVNEKXPRPTCIGDRQElTOKEWLNLENHRDQELIiHASCQA 
SGEVPS QASIiRGFFTEDE PG CFGEGENLP EALQN IQDEGTGEQL 
S PQERI SEKQIiGQHLPNPHSGEMSTMWLEEKRETSQKGQPRAPM 
AQKLPTCRECGKTFYRNSQLIFHQRTHTGETYPQCTICKKAPLR 
SSDF VKHQRTHTGEKPCKCD YCGKGFS DFSGIiRHHEKIHTGEKp 
YKCPICEKSFIQRSNFNRHQRVHTGEKPYKCSHCGKSFSWSSSL 

DKHQRSHLGKKPFQ*PVTKLSFPISISQPSHXNTQLHQEELCLR 
GYPC 




1 


469 


FSGFGAVPDAIiGCRMSDLRITEAFLYMDYLCFRALCCKGPPPAR 
PEYDLVCIGLTXSSGKTSLLSKLCSESPDNWSTTGFSIKAVPFQ 
iwu-ijw v i^ttiXsOALiw I RKYWSR YYQGSQGVI FVLDSAS S EDDLE A 
ARN*SCTQLLQHPQLCTLPFLILA 


7059

7060


1 


1178 


WPAFPRQ PAAAAMDALLGTGPRRARGCIiGAAG PTS5GRAARTPA 
APWARPSAWLECVCVVTFDLELGQAIjELVYPNDFRLTDKEKSSI 
CYLSPPDSHSGC3W3DTOFSFRMRQCGGQRSPWHADDRHYNSRAP 
VALQREPAHYFGYVYFRQVKDSSVKRGYFQKSLVLVSRLPFVRL 
FQALLSLIAPEYFDKLAPCLBAVCSEIDQWPAPAPGQTLNLPVM 
GVWQ VR I PSRVDKS ESS P PKQ FDQENLLPAP WLAS VHELDLF 
RCFRPVLTHMQTLWELMLt^E PLLVLAPS PDVS SEMVLALTSCL 
QPLRFCCDFRP YFT IHDS EFKEFTTRTQAP PNWLG VTNPFF I K 
TLQHWPHILRVGEPKM9GDLPKQVKLKKPFKV* RPWDTKP 


/wow yu 


1670 


S VWIjP PS L WP WEEAMDS TKS E PLKGS PEAEDGN I E YKKL VNPSQ 
yRFEHIiVTQMKWRLQEGRGEAVYQ IGVEDNGLLVGLAEEEMRAS 








LKTLHRMAEKVCIADITVLRERETO 

Q0F LDLR VAVLGNVD SG KS TLLGVLT QGE LDNGRGRARLNLFRH 
LHEIQSGRTSSISFEILGFNSKGEVHGINGTQWGQTLRMGW* + + 
RT*DGGRVWRLFEIV*MNALRGL*TSSAPLRKSMGNQLN*IKNG 
VK1 KRQGHPGKGLG PONS EG VGRAGRRH * G P WALGQ VVNY SDS R 
TAEEICESSSKMITPIDLAGHHKYLHTTIFGLTSYCPDCALLLV 
SANTGIAGTTREHLGIiAliALKVPFFIWSKIDIiCAKTTVERTVR 
QL BR VLKQ PG CHKVPMLVTSEDDAVTAAQQ FAQS PNVTPI FTLS 
SVSGESLDLLKVFLNILPPLTNSKEQEELMQQLTEFQVDEIYTV 
PEVGTWGGTLSR*IDLLATLPTQPSPIYSKTSWPKGGDPGI 


7061 


364 


710 


ARMPSPLGPPCIiP VMDPETTIjEE PETARLRFRGFCYQEVAGPRE 
AIJUUjREIiCCQWLQPEAHSKEQMLEMLVLEQFLGTLPPEIQAWV 
RGQRPGSPEEAAAIiVEGLQHDP*ARMPSPLGPPCLPVMDPETTIi 
EEPETARLRFRGFCYQEVAGPREAXARLRELCCXJWLQPEAHSKE 
QMLEML VLEQ FLGTLP PE IQAWVRGQRPGS PEEAAAL VEGIiQHD 
PGQLLG 


7062 


71 


744 


AKAGTNLERLHWLSYFFCIPKHKLKSSQKDKVRQFMACTQAGER 
TAIYCLTQNEWRLDEATDSFFQNPDSLHRESMRNAVDKKKLERL 
YGRYKDPQDENKIGVDGIQQFCDDLSLDPAS ISVLVIAWKFRAA 
TQCEPSRKEFLDGMTEIiGCDSMEKLKALLPRLEQELKDTAKFKD 
F YQ FTFTFAKNPGQKGLDL* MAGAYWKL VLS GRFKFL YL WNTFL 
MEHH 


7063


2 


562 


LRTVPDLPGRRFRAMRTGQRR * PBLPPDMNSLEQAEDLkAFERR 
LTEYIHCLQPATGRWRMLLIWSVCTATGAWNWLIDPETQKVSF 
FTSLWNHPFFTISCITLIGLFFAGIHKRVVAPStlAARCRTVLA 
EYNMSCDDTGKLILKPRPHVQ*QSSLIVMGLKIAFLRISDTAKS 
HKGFLLRIiDM 


7064 


300 


884 


RDTGSDPSSTRRLCSTCCTGH* PAE P IAS PH PSRGTCP PAS S AS 
SRRTGCWTCPPESGHAQARRSRRAS AS RWGARGAVRS A VAARGC 
SSRAGRWLETPGRRRGP PACAAAAGRLRGPAP * AAPPTASVPAR 
CRCPAARTGAPAAATWLRRRLSGLRAPASGRRR6PGPSPKSAAP 
PLLTPLGAGRAGGSRANS 


7065


i ; 


555 


ATTTHSARRSGRGAAAEAAASAAGGRQKGPDRKAWEGRRTTPGG 
RSQSEPKAPPPQKRSEAAFASMAHSPVAVQVPGKQNNIADPEEIi 
FTKLERIGKGSFGBVFKGIDNRTQQWAIKI IDLEEAEDEIEDI 
QQEITVLSQCDSSYVTKYYGSYLKGSKLWIIMEYLGGGSALDLL 
RAGPFDEFQ 


7066 


356 


676 


PGPQRGPWRAREGGHPLDPADHPRAPASIiRSNVRAATMMQICDT 
YNQKHS LFNAMNR F I G AVNKMDQTVMVP SLLRDVPLAD P GLDND 
VGVEVGGSGGCLEERTPP 


7067


152 



973 


KEN ITMATE IGS P PRFFHMPRFQHQAPRQLFYKRPDFAQQQAMQ 
QLTFIX3KRMRKAVNRKTIDYNPSVIKYLENRIWQRDQRDMRAIQ 
PDAG Y YNDL VP P IGMLNNPMNAVTTKFVRTS TNKVKCP VFWRW 
TPEGRRIiVTGASSGBFTiWNGLTFNFETILOAHDSPVRAMTWSH 
NDMWMLTADHGG YV KYWQSNMNNVKMFQAHKEAI REARF IHN I P 
FS WP I VMVKLFSKCI LGAEMHGLCQFLGNFLHPI NTI FFFVFT 
H3PFCWAPF 


7068 


222 


816 


OTMKEYVLLLFIJUjCSAiCPFFSPSHIAIiK^r^KDMSm'DDDDD 
DDDDDDDDDDEDNSLFPTREPRSHFFPFDLFPMCPFGCQCYSRV 
VHCS DLGLTS VP TNIP FDTRM LDLQNNKI KE I KEND FKGLTS L Y 
GLILNKMKLTKIHPKAFLTTKKliRRLYLSHNQLSEIPLNLPKSL 
AELR I HENKVKKI QKDTFKKK 


7069 


1147


1765 


FRDHRRYFYVNEQSGESQWEFPDGEEEEEESQAQENRDETIiAKQ 
TLKDKTGTDSNSTESSETSTGSLCKESFSGQVSSSSLMPLTPFW 
TLLQSNVPVLQPPLPLEMPPPPPPPPESPPPPPPPPPAPKMPPP 








BKTKKGRKDKAKKSKTKMPSLVKKWQSIQREliDteEDNSSSSBED* 
RVSTAQKRIEBWKQQQLVSGMAERNANFBA 


7070 


1 


547 


DGTMEDSBAVQ RATAL I EQRIAQEBENEKLRGDARQKLPMJDIiLV 
LEOE KHHGAQS AAIjQ KVKGQBRVRKTSLDL RREI IDVGG XQNLI 
ELRKKRKQKKRDALAASHEPPPEPBBITQPVDEBTFLKAAVEQK 
MKVIEKFLADGGSADTCDQFRRTALHRASLEGHMEILEKLLDNG 
ATVDPQ 


7071


2 


921 


ARGTLRALETAKKVGKVGANGQKAAGPSAbSVTENKIGS P PKTP 
VSNVAATSAGPS NVGTELNSVPQKS SPFLTRVPAYPPHS ENIQY 
FQDPRTQIPFEVPQYPQTGYYPPPPTVPAGVAPCVPRFVRSNNV 
PESSLPPASMPYADHYSTFSPRDRMNSSFYQPPPPQPYGPVPPV 
PSGMYAPVYDSRRIWRPPMYQRDDIIRSNSLPPMDVMHSSVYQT 
SLRERYNSLDGYYSVACQPPSEPRTTVPLPREPCGHLKTSCEEQ 
IRRKPDQWAQYHTQKAPLVSSTLPVATQSPTPPSTLNRGEGS 


7072


2 


921 


ARGTLRALETAKKVGKVGANGQKAAdPSADSVTENKIGSPPKTP 
VSNVAATSAGPSNVGTBLNSVPQKSSPFLTRVPAYPPHSENIQY 
FQDPRTQIPFBVPQYPQTGYYPPPPTVPAGVAPCVPRFVRSNNV 
PESSLPPASMP YADHYSTFS PRDRMNSS PYQPPPPQPYGPVPPV 
PSGMYAP VYDSRRI WRPPMYQRDDI IRSNSLPPMDVMHSS VYQT 
SLRERYNSLDGYYSVACQPPSEPRTTVPLPREPCGHLKTSCBEQ 
IRRKPDQWAQYHTQ KAPLVS STLPVATQS PTPPSTLNRGEGS 


7073


50 


504 


LAHGSFaVSDFPAPAAAPAHTLTSFSGSLSPQFRKPLGRAPAMP 
LVR YRKWI LG YRCVGKTSLAHQFVEGEFSEGYD PTVENTYSKI 
VTLGKDEFHLHLVDTAGQDE YS ILP YS F I IGVHGYVLVYSVTSL 
HSFQVI ES LYQKLHEGHGK 


7074 


263 


lb03 


VCP VLCSTRQEPGHSSLVTYFG KPTRRKEFLLGHClAAG KMWI S 
VDLETOYAELVLDVGRVTLGBNSRKKMKDCKLRKKQNERVS RAM 
CALLNSGGG VI KAE I ENED YS YTKDGIGIiDUSNS FSNI LLFVPE 
YLDFWQNGNYFLIFVKSWSLNTSGLRITTLSSNLYraDITSAKV 
MNATAALE FJaKDMKKTRGRIiYLRPEIJaAKRPRVD I QEENNMKAL 
AGVFFDRTELDRKEKLTPTE5THVEI 


7075 


598 


1005 


NYINFFFRKEYPPHVQKVBINPVRLSRLQGVERlMkKTEESBSQ 
VEPE I KRKVQQKRHCS TYQ PT P PLSPAS KfCdiTKLBDLQRNCRQ 
AITLNESTGPLLRTS IHQNSGGQXSQNTGLTTKKFYGNNVEKVP 
IDII 




279 


1049 


LQSESSNAAEGNBQRHEDEQRSKRGGWSKGRKRKKPLRDSNAPK 
SPIiTGYVRFMNERREQLRAKRPBVPFPEITRMLGNEWSKLPPEE 
KQRYXiDEADRDKERYMKELEQYQKTEAYKVFSRKTQDRQKGKSH 
RQDAARQATHDHEKETEVKERS VFDI PI FTEEFLNHS KAREAEL 
RQLRKSNKEFEERNAALQKHVESMRTAVEKLEVDV1QERSRNTV 
LQQHLETLRQVLTS 8 FASMPLPEXGETPTVDTIDS YM 


7077


3 


1119 


S SMGSNSE I NGLALRKTDKYGFLGG SQ YS GSLKSS IP VDVARQR 
EliKWLDMFSNWDKWLSRRFQKVKLRCRKG IPSSLRAKAWQYLSN 
SKELLBQNPRKFEELERAPGDPKWLDVIEKDLHRQFP FHEMFAA 
RGGHGQQDLYRI LKAYT I YRPDEG YCQAQAP VAAVLLMHMPAEQ 
AFWCLVQICDKYLPGYYSAGLEAIQIjDGBIFFAIiLRRASPLAHR 

hlrrqr idpvlymtb wfmci eartlp wasvlrvwdmffceg vki 
i frvalvllrhtix3sveklrscqgmyetmeqlrn^ 
lvhevtnlpvtealierenaaqlkkwretrgelqyrpsrrlhgs 
raiheerrrqqpplgpsss 


7078 


483 


767 


FQGQRMAGEQKPSSNLLEQFtLIAICGTSGSALTALISQVLEAPG 
VYVFOELLEIiANVQETiAEGANAAYLQIiR^PAYGTYPDYIANKE 
SLPELY 


7079 


2 


376 


SWEFKRPKEPSGSDGESDGPIDVGQEGQLSQMARPLSTPSSSQ 
NQAR KKRRG 1 1 EKRRRDRINSS L3ELRRLVPTAFEKQGS S KLEK 








AEVLQmTOHLKMLHATGGTGTHALLPQASFIQQI P 


7080 


200 




VQLPLEAPCLSLLSCRDHSGGNRDLSRRHRDCRVYGSPt3DGiiPY 
LTHPLCHODVVSVGRLQIRALATPGHTQGHLVYLI/DGEPYICGPS 
CLFSGDLLPLSGCGEFPRKREBLGBEGBTSVRAATVPWRALKP ' 


7081 


213 


506 


AVTEEEWILNSLSLCYHNKLILAPMVRVGTLPMRLLAIiDYGADI *~ 

VYCEELIDLKMIQCKRWNEVLSTVDFVAPDDRWFRTCEREQN 

RWFQMGTS 


7082 


3 


1137 


APSRNTMId*UWCRGPVI^CLRQGLGra^ ~ 
CCRSSPRDLRDGEREHEAAQRKAPGAESCPSLPliSISDIGTGCL 
SSLENLRLPTLREESSPRELEDSSGDQGRCGPTHQGSEDPSMLS 
QAQSATEVEERHVSPSCSTSRERPFQAGELItAETGEGETKPKK 
LFRLNNFGLLNSNWOAVPFQKI VGKFPGQ I LRSSFGKQYMLRRP 
ALBDYVVliMKRGTAITFPKDINMILSMMDINPGDTVLEAGSGSG 
GMSLFLSKAVGSQGRVISFBVRKDHHDLAKKNYKHWRDSWKIjSH 
VEEWPDNVDFIHKDISGATBDI KSLTFDAVALDMLNPHVTLPVF 
YPHLKHGGVCPVYWNI TQVI BLLD 


7083 


115 


541 


RSNAVQLTRMEYAMKSIiSLLYPKSLSRHVsVRTSWTQQLLSEP 

spkaprarpcrvstadrsvrkgimaysledlllkvrdtlmladk 
pfflvleedgttveteeyfqalagdtvfmvlqkgqkwqppseqg 

TRHPIiSLSHK 


7084


3 


522 


NSVSVS5QSRFIASVPGTGVQRSAAADMAASTAAGKQRIPKVAK- 
VKKKAPAEVQITAEQLLRBAKERELELLPPPPQQKITDEEELND 
YKLRXRKT FE DNI RKNRTVI SN W I KYAQWBE S LKE I QRAR S I Y E 
RALD VDY RN I TLWLKY AEM EMKNRQ VNHARN I WDRAI TT L 


7085


243 


1499 


RQLARLRRRG WRSPFGGAPMAHITINQYLQQVYEAIDSRDGASC " " 
AELVS FKHPHVANPRLQMAS PE EKC QQVLEP P YDEMFAAHLRCT 
YAVGNHDF I EAYKCQTVI VQS FLRAFQAH KE ENWAL P VMYAVAL 
DLRVFANNADQQLVKKGKS KVGDMLEKAAELLMSCFRVCASDTR 
AGIEDSKKWGMIiFIiVNQLFKIYFKINKLHUCKPIilRAIDSSNLK 
DDYS TAQRVTYKY YVGRKAMFDSDFKQAEEYLS FAFEHCHRSSQ 
KNXRMILIYLLPVKMLLGHMPTVELLKKY^ 
NLLLIJIEAIJIKHEAFFIRCGIFLILEKLKIITYRNLFKKVYLLL 
KTHQLSLDAFLVALKFMQVEDVDIDEVQCILANLIYMGHVKGYI 
SHQHQKLWSKQNPFPPLSTGC 


7086 


256 


525 


II1AARWGKQNSKLRPEVMQDLLESTDFTEHEIQEWYKGFI1RDCP" 
SGHLSMEEFKKI YGNFFP YGDASKFAEHVFRTFDANGDGT I DFR 
EF 


7087 


166 


723 


LSGS sagkvaapcvppsnhelvpittenapknvvdkgegasrgg" 

NTRKS LEDNGS TRVTPSVQ PHLQP I RNMS VS RTMEDS CELDLVY 
VTER 1 IAVS FPS TANEENFRSNLRE VAQMLKS KHGGNYLL FNLS 
ERRPDITKLHAKVLEFGWPDLHTPALEKICSICKAMOTWLNAHP 

hrcrvlhnkg 


7088 


104 


759 


gtsaaspssllemageitet<3ely6syvglvymfni,ivgtgalt - 
mp kafatagwlvsl vllvflgfms fmtttfv ieamaaanaqlhw 

k.ki»j afj iilUS Hb uu ua STAS DS D V L»l RDNYERAEKRP IliSVQRRGS 
PNPFEITDRVEMGQMASMFFNKVGVNLFYFCIIVYLYGDIAIYA 
AAVP FS LMQ VTCS ATGNDS CG VEADTKYNDTDRCWGPLRRVD 


7089 


33 


1775 


SVCWEDRYLKARMEESPLSRAPSRGGVNFLNVARTYIPNTKVEC 
HYTLPPGTMPSASDWIGIFKVEAACVRDYHTFVMSSVPESTTDG 
S P IHTSVQFQASYLPKPGAQLYQFRYVKRQGQVCGQSPPFQFRE 
PRPMDE LVTLEEADGGSDI LLVVPKATVLQNQLDESQQERNDLM 
QLKLQLEGQVTELRSRVQELERAIiATARQEHTELMEQYKGISRS 
HGEITEERDILSRQQGDHVARILBLEDDIQTISBKVLTKEVEIjD 
RLRDTVKALTREQEKLLGQLKEVQADKEQS EAELQVAQQEiniHL 
NLDLKEAKSWQEEQSAQAQRLKDKVAQMKDTLGQAQQRVAELEP 








LXEQLRGAQE LAASSQQ KATLLGE EIASAAAARDRT IAELHRSR 

LS AS I LRliEKAVQEBRTQNQ V F KT ELAREKD S SLVQLS ES KRE 
TBLRSAliRVLQKEKEQLQEEKQELl»EYMRKLEARLEKVADEKWN 
EDATTE DE EAAVGL S CPAALTDS EDE S PEDMRLHPMAP VS VBTQ 
ASLLLGLE 


7090 


33 


1775 


S VCWEDR YLKARMEESPLSRAPSRGG VNFLNVARTYI PNTKVEC 
HYTIiPPGTMPSASDWIGIFKVBAACVRDYHTFVWSSVPESTITO 
SP IHTSVQFQASyLPKPGAQLYQFRYVNRQGQVOGQSP PFQFRB 
PRPMDELVTLEEAIXSGSDILLWPKATVLQNQliDESQQERNDLM 
QLKX^LBGOVTELRSRVQELBRALATARQBHTELMEQYKGISRS 
HGEITEERDILSRQQGDHVARILELEDDIQTISBKVLTKEVBLD 
RLRDTVKALTREQE KLLGQLKE VQADKEQS EAELQVAQQENHHL 
mJDLKEAKSWQEEQSAOAQRLKDKVAQMKDTLGQAQQRVAELEP 
LKBQLRGAQEIAASSQQKATLliGEEIiASAAAARDRT IAELHRSR 
LBVAEVITOKLAEIjGLHLKEEKCQWSKBRAGLLQSVKABKDKII'K 
LSAEILRLEKAVQEBRTQNQVFKTELAREKDSSLVQLSES KRHX 
TELRSALRVLQKEKEQLQEEKQELLEYMRKIiEARLEKVADEKWN 
EBATTEDEEAAVGLSCPAALTDSEDESPEDMRLHPMAFVSVETQ 
ASLLLGLE 


7091 


186 


1076 


EGMLTREHRCGRS EEQELEPWPS P KKAK5GRWLRNGFKRKMEEP 

eepadsgqslvpvyiyspeyvsmcdslakipkrasmvhslieay 
alhkqmrxvkpkvasmebmatfhtdaylqhlqkvsqegdddhpd 

SIEYGLGYDCPATBGIFDYAAAIGGATITAAQCLIDGMCKVAIN 
WSGGWHHAKKDEASGFCYLNDAVLGIIaRLRRKFERILYVDLDLH 
HGDGVEDAPS PTSKVMTVS LHKFS PGFFPGTGDVSDVGLG &GRY 
YSVNVP I QDG I QDEKYYQ I CER YE P PAPNPG L 


7092 


522 


809 


KQGINEDQBBSQKPRLGEGCEPISKRQMKKLIKQKQWBEQRBliR 
KQKRKEKRlCPvK3CLERQCQMBPNSDGHDRKRVRJU)VVHSTLRXiI I 
DCSFDXLM 


7093 


454 


655


NFGVSGVELAQQASMVRMS PVIAACQL VLGLLMTSLTES S IQNS 
ECPQLCVCEIRPWFTPQSTYREA 


7094 


2 


508


FVRSI^GVGPASSRPCVVDLSWNQSISPFGWWAGSEBPFSF^a 
DIIAFPLQDYGG IMAGLGSDPWWKKTLYLTGGALliAAAA YLL»HE 
LLVTRKQQE IDS KDAI I LHQFARPNNGVPSLS P FCLKMETYTjRM 

adlpyqnyfggklsaqgkmpwieynhekvsgtefii 


7095 


1 


411 


iasslpkmasli^sdrvlylvqgekkvrapi^qlyfcrTUsel^ 
slbcvshevdshycpsttenmpsaeaklkknilcancpdcpgcmh 
tlstrats istqlpddpakttmkkayylacgfcrwtsrdvgmad 
ksvge 


7096 


224 


2067 


etrslavqekpsqagrrrssrisfagalfltrfllqelllnnfc 
samspapdaapapas islfdlsadapvfqglslvshapgeaiiar 
aprtscsgsgeresperkllqgpmdiseklfcstcdqtfqnhqs 
qrbhykldwhrfnlkqrlkdicpllsaldfekqsstgdlssisgs 
bdsdsaseedlqtldreratpeklsrppgfyphrvlpqnaqgqf 

LYAYRCVLGPHQDPPBEAELLLQNLQS KGPRDCWLMAAAGHFA 
GAI FQGRE WTHKTFHR YTVRAKRGTAQGLRDARGG PSHSAGAN 
LRKYNE ATL YKDVRDLLAG PS WAKALE EAGT I L LRAPRSG RS Xj P 
FGGKGAPLQRGDPRL WDI PLATRRPTFXJBIjQRVLHKLTTLHVY*E 
EDPREAVRLHSPQTHWKTVREERKKPTEEEIRKI CRDEKEALGQ 
NEBSPKQGSGSEGEDGFQVELELVELTVGTLDLCESEVLPKRRR 
RKRNKKEKSRDQE AGAHRTLLQQTQEEE PS TQS S Q AVAAPLGP l 
LDEAKAPGQPELWNALLAACRAGDVGVLKLQLAPS PADPRVI»S L 
LS APLGS GGFTLLHAAAAAGRGS WRLLLE AGADPTVQCQDH 


7097 


256 


1228 


IRTKSAATWEAWPQCGREGSRIITEPCEANAGSRQELQTERISS 
FLAAQGDQAFHSGIiETKNSNS ELPLRVGLKVAQGS PLMGGQ VS A 








SNS FSRIiH CRNANEDWMSALCPRLWDVPLHHLS I PGSHDTMTYC 
LNKKS P I SHE E S RLLQLLNKALP CI TRPVVLKWS VTQ ALD VTEQ 
LDAGVRyiiDLRIAHMLBGSBKNLHFVHMVYTTALVBDTLTEISB 
WLE RHPRE WI LACRNFEGLS EDLHEYLVACI KNI FGDMIiCPRG 
EVPTLRQLWSRGQQVIVSYEDESSJLRHHHBLWPGVPYWWGNRVK 
TEALIRYLETMKSCGR 


7098 


! 82 


956 


SSPLKRCRKVLGCWGIPSEQSLFSTLEEPRDKEIDNYCVMRLQT 
EARS G FWAPNRFP VN I CRMTAVDGDRGGSSRBTCRCRFH PSliEA 
LVLLLQDWQPGGVGICTSFU3ISWALLDYHRALRTCLPS1CPLLG 
LGSSVIYFLWNLLLLWPRVLAVALFSALFPSYVALHFLGLWLVL 
LLWVWLQGTDFMPDPSSEWLYRVTVATILYFSWFNVAEGRTRGR 
AI I HFAFLLS DS I LLVATWVTHS S WLPSGI PLQLWLP VG CGCFF 
LGLALRLVYYHWLKPS CCWKPDPDQVD 


7099 


992 


210 


LFRLAPGFIiRSLARQGYHQIWAFPFLPSGATATWPAASRSRSLA 
ARSIiPRSPARPGPNDALLGEHDFRGQGVRAQRFRFSBBPGPGAD 
GAVLEVHVPQ IGAGVSLPG I LAAKCGAEVI LSDSS ELPHCLE VC 
ROS CQMNNLPHLQWGIiT WGHIS WDI*LALPPQDI ILASDVFFEP 
EDFEDILATIYFLMHKNPKVQLWSTYQVRSADWSLEALLYKWDM 
KCVHIPLESFDADKEDIAESTLPGRHTVEMLVTSFAKDSL 


7100 


205


671 


ANGGFWEAAPGSEVSLPLWVPTASHSKTTALGIGSAPPPHLSVL 
FLFS FP PQLGD PLE AFP VFKKYDRNGIiNVS IE CKRVSGLB PATV 
DWAFDLTKTNMQTMYEQSEWGWKDREKREEMTDDRAWYLI AWEN 
S S VP VAFSHFR FDVERGDEVLYW 


7101 


2 


503


WRGGPRRAKPJLAGGAVGWVl.LVmGVHSVRAGGGRPPRAAbMKKD 
VRILLVGEPRVGKTSLIMSLVSEBFPEEVPPRAEEITIPABVTP 
ERVPTHrVDYSEAEQSDEQLHQEIfiQANVICIVYAVNNKHSIDK 
VTSRWI PLINERTDKDSRbPLILGGNKSDLVBYSR 


7102 


2 


503 


WRGGPRRAKRIAGGAVGWVljLVRGVHSVRAGGGRPPRAADMKKD 
VRILLVGEPRVGKTSlxIMSIiVSEBFPEBVPPRABBITIPADVTP 
ERVPTHIVDYSEAEQSDEQLHQEISQANVICIVYAVNNKHSIDK 
VTSRWIPLINERTDKDSRLPLILGGNKSDLVBYSR 


7103 


ii9 ■■ 


438 


GSQSSVAVNIRSGTDEESMDLMNGQASSVNIAATASEKSSS3E3 
LSDKGSELKKSFDAWFDVLKVTPBEYAGQITIiMDVPVFKAIQP 
DELS S CGWNKKEKYS SAP 


7104


1670 


795 


RLWEHRSVSAGASGWGIiSSPGCLLLHPSLPEEERVDILINNAGV 
MRCPHWTTEDGFEMQFGVNHLGEAWAGAAPWVQAILPRRPPJCVL 
GF*V*VKSDLFIILNPGHFLLTNLLLDKLKASAPSRIINLSSLA 
HVAGH I DFDDLNWQTRK YNTKAAYCQS \ KLAI VLFTKELSRRLQ 
GSGVTVNALHPG VARTELGRHTG IHGS TFLQHHN\ WAHLLAAWS 
KS PRSW PAPAQHNTIiAVAEELA\ VI SG KYFDGLKQKAPAPEAED 
EEVARRLWAESARLVGLEAPSVREOPLPR 


7105 


755 


143 


GQMCRRPSPKSTSCLSMTCDLP/RGIiQDPQCIiALFRVAVDKHQA ' 
LLKAAMSGQGVDRHLFALYIVSRFLHLQSPFLTQVHSEQWOLST 
SQ I P VQQMHLFDVHNYPDYVS SGGGFG P ADDHG YG VS YI FMGDG 

MTTFHXfiSICft'?<;TVTncUT3T^2ritJTT?naT T ranter xrr\n.r^ /watmr\ 
r,A ii'n^oijfsMjj i «»* uo nt\. lA^vtl x c*ui\ \.\L>\ > v Ao L»r yAoQH* JQiR 

FRGSGKENSRHRCGFLSRQTGASKASMTSTDF 


7106 
7107 


14 
1145 


1064 
591 


GLQAGH PHPRS AS RIP EADTH \ YS KLQRAFDS IWKDHKRM FGT ' 
YFRVGFFGSKFGDIiDEQEFVYKEPAITKLPEISHRLEAFYGQCF 
GAEFVEVIKDSTPVDKTKLDPNKAYIQITFVEPYFDEYEMKDRV 
TYFEKNFNLRRFMYTTPFTLEGRPRGELHEQYRRNT^TTMHAF 
P YI KTR I S VIQKEB FVLTP I E VAIEDMKKKTLQ LAVAINQEPPD 
AKMLQMVLQG5 VGATVNQGPLE VAQVFLAE I PADP KLYRHHNKI* 
RLCFKEF IMRCGEAVEKNKRLI TADQRE YQQBLKKNYNiOjKBNIj 
RPMIERKIPELYKPIFRVESQKRDSFHRSSFRKCETQLSQGS 
*I*WLQTGKKK 


7108 


1 


942 


VKVALI^TNLEQPRTESEWENSFTLKhJFXFQFVNLNSSTPYIAP 
FLGRFTGHPGAYLRLINRWRLEECHPSGCLIDLCMQMGI I MVLK 
QTWNNFMELGyPIilQNWWTRRKVRQEHGPERKI S FPQWEKDYNI* 
QPMNAYGLPDEYLEMILQFGFTTIFVAAPPLAPLLAIiLNlTI IEI 
RLDAYKPVTQWRRPLAS RAKD IGI W YGILEG IG I LS VITNAFVT 
AITSDFIPRLVYAYKYGPCAGQGEAGQKCMVGYVNASLSVFRIS 
DFENRSEPBSDGSEFSGTPLKYCRYRDYRDPPHSLVPYGYTI*QF 
WHVLAW 


7109 


964 


102 


W DQRKRNS LVPGPAHGPAQEEPWE KKBS I/jAAQBALS IQLQPKE - 
TQ PFPBCSEQVYLHFLS WTEDGPEP KDKGSLPQ P P ITEVE SQVF 
SEKLATDTSTFBATSEGTLBLQQRNPKAERLRWSPAQEES FRQM 
WIHKEIPTGKKDHECSECGKTFIYNSHLWHQRVHSGBKPYKC 
SDCGKTFKQSSNLGQHQR IHTGEKPFECNECG KAFRWGAHLVQH 
QR ZHSG EKP YECNECGKAFSQSS YLSQHRRI HS GEKP FI CKECG 
KAYGWCSELI RHRRVHARKEPSH 


7110 


96 


697 


RLDNFSGFLVEVTKEERH I VKPLYDRYRLVKQMLTRASIT PVLG 
SPSTKRRGQMLQPIIEGETAHFFEEIKBEEEDGVNLSSELGDML 
KTAVQVQSSLiCNSESOVSENQEKLALDLRLSSSRAASMPBLLEQ 
LW KARAEKKKLRKTLRE FE EAFYQQNGRNAQKE DRVP VL E EYRE 
YKK1KAKLRLLEVLISKQDSSKSI 


7111 


2 


414 


OSGLYRGPTPGQQCIWKPNSMPPDHERNFGFTQFALELNELTAE 
LKRSLPSTDTRLRPDQRYLEEGN1QAAEAQKRRIEQLQRDRRKV 
MEBJmiVHQARFFTlRQTDSSGKEWWVTNNTYWRLRAEPGYGNMD 
GAVLW 


7112 


103 


495


PRCFPVADRGRLIGGLPDWTIMEGKTLNLTCTVFGNPDPEVI5T" 
FKNDQDIQLSEHFSVKVBQiAKYVSMTIKGVTSEDSGKYSINIKN 
KYGGE KI DVTVS VY KHGE K I PDMAP PQQAKPKL I PASAS AAGQ 


7113 


1 


624 


KCLRQAWHEAPSSLAFTRWCSREERAEGGGNLHRS ITRDPKPPG 
LRPSQRPMDDKKKKR5PKPCLAQPAQAPGTLRRVPVPTSHSGSL 
ALGLPHLPSPKQRAKFKRVGKEKGRPVLAGGGSGSAGTPLQHSF 
I/TEVTDVYEMEGGIXNLLNDFHSGRLQAPGKECS FEQLEHVREM 
QEKZiARLHFSZiD VCGEE BDDEEBTSDG VTEGLPEEQKK'IMADRNIj 
DQLLSNLGS CLGAL VPGGMRGG EGTY S QSHS WALG E KVGVHG S K 
SSGPLNLPRR 


7114 



3 


1492 


VWEVDEQIDHYKESQDKFLWQAAFIGKETLKDESGQECK1CRKI 
IYLNTDFVSVKQRLPKYYSWERCSKHHLNFLGQNRSYVRKKDDG 
CKAYWKVCLHYNIiHKAQPAERFFDPNQRGKALHQKQALRKSQRS 
QTGEKL YKCTECG KVFIQKANLVVHQRTHTGEKP YECCECAKAF 
SQKSTL IAHQRTHTGEKPYECSECGKTFIQKSTLI KHQRTHTGE 
KPFVCDKCPKAFKSSYHLIRHEKTHIRQAFYKGIKCTTSSLIYQ 
RI HTS EKPQCSBHGKASDEKPS PTKHWRTHTKEN IYECS KCG KS 
PRGKSHIiSVHQRIHTGEKPYECSlCGKTFSGKSHLSVHHRTHTG 
EKPYECRRaSKAFGEKSTLIVHQRMHTGEKPYKCNECGKAFSEK 
SPLIKHQRIHTGERPYECTDCKKAFSRKSTIiIKHQRIHTGEKPY 
KCS ECGKAFSVKSTL I VHHRTHTGEKP YECRDCGKAFS GKS TL I 


7115 


1


947 


NAAHG YNWGLW CM Y 1 1 PPQDWLDRGDESAP I RT P AM I GCS PWD 
RETFGDIGLLDPGMEVYGGENVKLGMRVWQCGGSMEVLPCSRVA 
HIERTRKPYNNDIDYYAKRNALRAAEVWMDDFKSHVYMAWNIPM 
SNPG^FGDVSERIALRQRLKCRSFKWYLEN^PBMRVYNNTLT 
YGEVRNSKASAYCLDQGAEDGDRAILYPCHGMSSQLVRYSADGL 
LQLGPLGSTAFLPDS KCLVDDGTGRMPTLKKCEDVARPTQRLWD 
FTQSG P I VSRATGRCLEVEMS KDANFGLRLWQRCSGQKWM IRN 
WIKHARH 


7116 


866 


95 


RVRMR RNAE V I EE KLS M KS WAKFR PG EP W KG YPN ID PETDP YVT 
PGS VINNLS INTVREVDHLRDRNSGS S S S LNTTLPSTS AWS SIR 








A5NYNVPLSSTAQSTSARNSDSKLTWSPGSVTNTSIAHELWKVP ~ 
LPP KN I TAPS RP P PGLTGQKP PLSTWDNS PLR XGGGWGN5DAR Y 
TPGSSWGBSSSGRITNWLVLKNLTPQIDGSTIjRTLCMQHGPLIT 
FHLNIjPHGKALVRYSSKEEWKAOKSLHISDLFLLTL 


7117 


695 


1261 


LLISTPGGCHPPPSSIEFTYTGAWGKALPAPHMPCAPGAIiPQGA 
FVS Q AARAI P LI»Q PS Q AAQ AEGLS Q P ARACGALCSL PW PliRNWG 
S P I LRLPGG LRTPTNDR KTRTRS AMACWARAQWDTLG PLKLSHR 
GKVCLRHPRPTGVRGGPGAAGRQGGMGTRRRGTFTSGARDPGGL 
RVKHRCQPTGHLP 


7118 


49 


1863 


PHCEPNPGAGAMVLLHVLFEHAVGYALLALKEVKEISLiiQPQVE " " 

ESVLNLGKFHS I VRLVAFCPFAS SQVALENAKAVSBGWHEDLR 

LIXETHLPSKKKKVLLGVGDPKIGAAIQEELGYNCQTGGVIAEI 

LRG VRLHFHNL VKGLTDLSACKAQLGIiGHS YSRAKVKFNVKRVD 

NMIIQS ISLLDQLDKDINTFSMRVREWYGYHFPELVKI INDNAT 

YCRLAQFIGNRJtELNEDKLEKLEEliTMIXSAKAKAlLDASRSS^ 

MDISAIDLINIESFSSRWSLSEYRQSrjHTYLRSKMSQVAPSLS 

ALlGEAVGARLIAHAGSLTNIiAKYPASTVQILGAEKALFRALKT 

RGNTPKYGX.I FHSTFIGRAAAKNKGRISRYLANKCSIASRIDCF 

SB VPTS VFGE KLRBQVEERLSF YETGEI PRKNLDVMKBAMVQAE 

EAAAEITRKLEKQEKKRLKKEKKRIAALALASSEMSSSTPEECE 

EMSEKPKKKKKQKPQEVPQENGMBDPSISFSKPKKKKSFSKEEL 

MSSDLEETAGSTSIPKRKKSTPKEETVNDPEEAGHRSGSKKKRK 

FSKEEPVSSGPEEAAGKSSSKKKKKFHKASQED 


7119


49 


1863 


PHCEPNPGAGAIWIiLHVLFEHAVGYALLALKEVEEISLliQPQVE 
ESVLNLGKFHS I VRLVAFCPFASSQVALENANAVSECVVHEDLR 
LLLETHLPSKKKKVLLGVGDPKIGAAIQEELGYNCQTGGVIAEI 
LRG VRLHFHNL VKGLTDLS ACKAQLG LG HS YS RAXVKFNVNRVD 
NMI IQS I SLLDQLDKD INT FSMRVREWYGYHFPELVKI INDNAT 
YCRLAQFIGNRRELNEIJKLEKLEELTMDGAKAKAItiEtASRSSMG 
MDISAIDLINIESFSSRWSLSEYRQSLHTYLRSKMSQVAPSLS 
ALIGEAVGARLIAHAGSLTNliAKYPASTVQIU3AEKALFRALKT 
RGNTPKYGLIFHSTFIGRAAAKNKGRISRYLANKCSIASRIDCF 
SEVPTSVFGBKLREQVEERLSFYETGEI PRKNLDVMKEAMVQAE 
EAAAEITRKLEKQEKKRLKKEKKRIJUU!»ALASSENSSSTPEECE 
EMSEKPKKKKKQKPQEVPQENGMEDPS ISFSKPKKKKSFSKEEL 
MSSDLEETAGSTS IPKRKKSTPKEBTVNDPEEAGHRSGSKKKRK 
FSKBEPVSSGPEEAAGKSSSKKKKKFHKASQED 


7120 


1991 


64 


QLGTRRCLRGDKVTNAMQDFIjVTNIiE PRF I E PQTANIjS WFKDS 
NSTTPLI FVLSPGTDPAADLYKFABEMKFSiCKLSAISJjGQGQGP 
RAEAMMRSS IERGKWVFFQNCHLAPSWMPALERLIEHINPDKVH 
RDFRLWLTSLPSNKFPVS XLQNGS KMTX EPPRGVRANLLKS YSS 
LGEDFLNSCHKVMEFKSLLLSLCLFHGNALBRRKFGPLGFNIPY 
E FTDGDLRI C I S Q L KMFL DE YDD I P Y KVLKYTAGE INYGGR VTD 
DWDRRCIMNILEDFYNPDVLSPEHSYSASGIYHQIPPTYDLHGY 
LS YIKSLPLNDMPEI FGLHDNANITFAQNETFALLGTI IQIiQPK 
S SS AGSQGREEI VEDVTQNI LLKVPEP INLQWVMAKYPVL YEES 
MNTVLVQEVIRYNRLLQVITQTLQDLLKALKGLWMS SQLELMA 
ASL YNNTVPELWS AKAYPSLKPLSSWVMDIjLQRLD FLQAW IQDG 
IPAVPWISGFFFPQAFLTGTLQNFAPJCFVISIDTISFDFKVMPE 
APSELTQRPQVGCYIHGLFLEGARWDPEAFQLAESQPKELYTEM 
AVIWLLPTPNRKAQDQDFYLCPIYKTLTRAGTLSTTGHSTNYVI 
AVE I PTHQPQRHW I KRGVAL I CALDY 


7121 


2 


546 


RPLRPHVI^LGS^rVGI^TXYGRRQFQSLDTTMRRLIPPFREASAK 
LTTLVDADAEAFTAY LE AMRLP KNT P E E KDRRTAALQEGLRRAV 
S VPLTIiAET VAS LWPALQELARCGNLACR SD LQVAAKALEMGVF 
G A YFNVL IN LRD I TDEAFKDQ IHHRVS S L LQ E AKTQ AALVLD CL 






WO 01/53312 



PCT/US00/34263 



ETRQE 


7122 


2 


546 


RPI^PWVLSI^SMVGLMTYGRRQFQSli-DTTMRRLIPPPRfeASAK 
LTTLVDADAEAPTAYXiEAMRLPKNTPEBKDRRTAALQEGLRRAV 
S VPLTLAETVASLWPALQEIARCGNLACRSDLQVAAKAL emgvf 
GAYFNVLINLRDITDr^KDQIHHRVSSLLQEAKTQAALVLDCL 
ETRQE 


7123 


1 



1092 


KPAVPEARSAGTS EAGRSGAEE VS CGS VSG DGAAMRLT P RALCS 

aaqaawrenfplcgrdvarwppghmakglkkmqsslklvdci I B 

VHDARI PL SGRNPLFQETLGLKPHLLVLNKMJDIADLTBQO KIMQ 
HLEGEGtiKNVI FTNCVKDENVKQ 1 1 PMVTELIGRSHR YHRKENL 

bycimvigvpnvgksslinslrrqhlrkgkatrvggepgitrav 
mski qvserplmflldtpgv1apr ie5 vetglklalcgtvldfil 

VGBETMADYLLYTLNKHQRFGYVQHYGLGSACDNVERVLKSVAV 
KLGKTQKVKTOTGTGNVOTIQPNYPAAARDFLQTFRRGLIjGSVM 
LDUDVLRGHPRV 


7124 


2 


382


Ii PLTLLLAAP FAHLLLP PGkDQS P CWHP GPALS P GTLGPL SWAM 
ANSGLOLLGYFLALGGWVGI IASTAL PQWKQS S YAGDAS I QLRS 
KVFVLESEWGGDSLGLPRDCGWSCLLHSAVRSEKGFWS 


7125


166 


1127 


NCISEKRNYSFSMQKGKGRTSRIRRRKLCGSSESRGVNESHKSE 
FIELRKWLKARKFQDSNLAPACFPGTGRGLMSQTSLQEGQMI IS 
LPBSCLLT\RDTVIRSYLGAYITKWKPPPSPLLALCTFLVSEIGJ 
AGHRSLLEA\ YLEI LPKAYTCPVCLEPEVVNLLPKS LKAXAEEQ 
RAHVQEFFASS RDFFSSLQPLFAE AVDS I FS YS ALLWAW CTVNT 
RAVYI*\ S PGSGKtAFLQSRTPVQLAP YLDLLNHS PHVQVKAAFNE 
ETHSYEIRTTSRWRKHEBVFICYGPHDNQRLFLBYGFVSVHNPH 
ACVYVSRGWNQLCS 


7126 


1 


733 


CRDMAAFI VPSPARRCSQKGSLGHLPTQPWLWAAMS PRGQERGT 
SHSQAREPQRPGRWLLGSLQSSPGTLGQAGTASRRRGCMVQRWrV 
OVATGRRAVQVPKGALGIALGETSPGASRGMSGGAGGCWALGWA 
PSPVLPSWLLBGPPPWLSIISDSGTQRPSPRRCPARPSPWGPQC 
MRGGRIASAEASST*TPGSGSRARSGRRSPGSRRRSASAPSPTP 
PTDACA* SCVAR PAGSRSSR PAAA 


7127 


1311 


277 


GbPAMCST*KAOYYEETEGDCIPKDR*IEKRPFKEI*RRIPRlF " 
AKQKQI +S*NSQKIGASEIDRGRKEADCSDAPAAARIGAVSVFR 
RSTQEARVSPRSNAKSANLRAVRAD* WEHF VLLFHT P EQFT*AEC 
ICRST* *K*WHQLC*PLSSL*TGliKRKLLL*VLFRI *WLKDCDV 
* FCQKI FATNFCNWQNLIQ*EE* KPVEYSVEN* HIMNLLLPM* L 
CQSSLRDQTIVTWRM^RNYSMFRINMlSSIi* DGSIHI PLKLHFY 
PALIFTLTVPINSCCQRPLPLFAHQSIKTLASSGSPMLACIiRFL 
LVKKRAFIHTPRSPGCSV* CKHVLVKDNKNNCVGSE V 


7128 


2 


5228 


GRVDLWTILLGRSAliRELSQIEAELNKHWRRLLEGLSYYKPPS p 
SS AEKVKANKDVAS PLKEIK3LRI SKFLGLDEEQS VQl.LQCYLQE 
DYRGTRDSVKTVLQDERQSQALILKIADYYYEERTCILRCVLHL 
LTYFQDERHPYRVEYADCVDKLEKELVS KYRQQPEEL YKTEAPT 
WETHGNLMTERQVSRWFVQCLREQSMLLEI IFLYYAYFEMAPS0 

GMDI ESIiHKCALDDRRELHQFAQDGL I CQDMDCLMLT FGD I PHH 
APVLLAWALLRHTLNPEETSSWRKIGGTAI QLNVFQYLTRLLQ 
SliASGGNDCTTSTACMCVYGLLS FVLTSLEliHTLGNQQDI IDTA 
CE VLAD PS LPELFWGTEPTSGLG I ILDS VCGMFPHLLS PLLQL L 
RAL VSGKS TAKKVYS FLDKMS F YNEL YKHXPHDVT SHEDGTL WR 
RQTPKLLYPI/3GQTNLRIPOGTVGQVMLDDRAYLVRWE YS YS S W 
TLFTCE IEMLLHWS TADVI QHCQR VKP I IDLVHKVI STDLS I A 
DCLLPITSRI YMLLORLTTVI S PP VDVI ASCVNCLT VLAARN PA 
KVWTDLRHTG FL PFVAHPVSS LSQM IS AEGMNAGGYGNLLMNS E 
QPG^EYGVTIAFLRLITTLVKGQLGSTQSQGLVPCVMFVLKEML 








PSYHKWRYNSHGVREQIGCLILELIHAILNLCHETDLHSSHTPS 
IiQ PLC I CSLAYTEAGQTVINIMGIGVDT IDMVMAAQPRSDGAEG 
QGQGQLLIKTVKLAFSVTNNVIRLKPPSNVVSPLEQALSQHGAH 
GNNL IAVLAKYI YHKHDPALPRLAI QLLKRLATVAPMS VYACLG 
NDAAAIRDAFLTRLQSK\IE\DMRIK\VM1L\EFLTVA\VETQP 
GLIELFLNLEVKDG\SDGSKEFSIK3MW\SCLHAV/VWELIDSQQ 
QDRYWCPPLLHRAAJAFLHALWQDRRDSAMLVLRTKPKFWENLT 
SPLFGTLSPPSETSEPSILETCAHMKIICLEIYYWKGSLDQP 
LKDTLKKFS I EKRFAYWSGYVKSLAVHVABTEGSS CTS LLEYQM 
LVSAWRMLLI IATTHADIMHLTDSWRRQLFLDVLDGTKALIiliV 
PASVNC1JILGSMKCTLLLILLRQWKRELGSVDEILGPLTEILEG 
VLQADQQLMEKTKAKVFS AFITVLQMKEMKVS DI PQ YS QLVLNV 
CETLQEEVIALFDQTRHS LALGSATE D KDS ME TDDCS R SRHRDQ 
RDGVCVLGLHLAKELCEVDEDGDS WLQVTRRLP I LPTLLTTLEV 
SLRMKQNLHFTEATLHLLLTLARTQQGATAVAGAGITQSlCLPIi 
LSVYQLSTNGTAQTPSASRKSLDAPSWPGVYRLSMSLMEQLIiKT 
LRYNFL PEALD FVG VHQERTLQ CLNAVRTVQS LACLEEADHTVG 
FILQLSNFMKEWHFHLPQLMRDIQVNLGYLCQACTSFLHSRKMr. 
QHYLQNKNGDGLPSAV\ AQRV\QRPPS AASAAPS S S KQPAADTE 
ASEQQALHTVQ YGLLKI LSKTLAALRHFTPD VCQ ILLDQSLDfcA 
E YN FLFALS FTTPTFDS EVAPSFGTLLATVNVALNMLGELDKKK 
E PLTQAVG LS TQAEG TRTLKS LliMFTMENCFYLL I SQ AMR YLRD 
PAVHPRDKQRMKQELS SELSTLLS SIiS R YFR RGAPSS PATGVLP 
SPQGKSTSLSKASPESQEPLIQLVQAFVRHMQR 




1 


1054 


FRRFRWRRRLH*AGPASSAGGSPGEASGTMSGEIiPPNINlKEPR 
WDQSTFIGRANHFFTVTDPRNILLTNEQLESARKIVHDYRQG IV 
PPGLTENELWRAKYI YDSAFHPDTGEKMI LIGRMSAQVPMNMTI 
TGCMMTFYRTTPAVLFWQWINQS FNAWNYTNRSGDAPLTVNEL 
GTAYVSATTGAVATALGLNALTKHVSPLIGRFVPFAAVAAANCI 
NI PLMRQRELKVGI PVTDENGNRLGESANAAKQAITQVWSRIL 
MAAPGMAIPPFIMNTLEKKAFIiKRFPWMS*a»IQVGLVGFCLVFA 
TPLCCALF PQKS SMS VTSLEAELQAKI QESHPELRRVYFNKGl* 


7130 


2 


780 


HE VPS LQTS DPLPGS VQRCS WVSQPNKENW CQDHL YNS LGRKG 
ISAKSQP YHRSQSSSS VLINKSMDS INYPSDVGKQQLLSLHRSS 
RCES HQDLLPDI ADSHQQGTE KLSDLTLQDS QKWWNRKLPliN 
AQIATQNYFSNFKETDGDEDDYVEIKSEEDESELELSHNRRRKS 
DSKFVDADFSDNVCSGNTLHSLNSPRTPKKPVNSKLGLSPYLTP 
YNDSDKLNDYLWRGPS PNQQNIVQSLREKFQCLSSS S FA 


7131 


805 


573 


AAAEGHI E WKFLIEACKVNP FAKDR WGNI PLDDAVQFNHLE W 
KLLQDYQDSYTI*SETQAEAAAEALSKENL3SMV 


7132 


1420 


1087 


I DMLLLSGALVSGP YTL ITTAVSADLGTHKSLKGNAHALS TVTA 
I IDGTGSVGAALGPLLAGLLS PS GWS NVFYMLM FADACALL FL I 
RL IHKELS CPGS ATGDQ VP FKEQ 


7133 


2 


3648 


QQIPGLLPAHGESGDALRKPRLQKPITGHLDDtFFTLYPSLEKF 
E E ELLELHVQDHFQEGCGPLDGGALE I LERRLR VGVHNGLGFVQ 
RPQWVL VPEMDVALTRSAS FSRKWSS S KTSSGS QAL VLRS Rl> 
RL PEMVGH PAFAVT FQLEYVFSS PAGVDGNAAS VTS LSNLACMH 
M VRWAVWNPLLEADS GRVTT iPT iflftf3 T n DTJ D QUPT^tv vrro e jv e m Q 

SEEVKQVESGTLRFQFSliGSEEHLDAPTEPVSGPKVERRPSRKP 
PTSPSSPPAPVPRVLAAPQNSPVGPGLSISQLAASPRSPTQHCli 
ARPTSQLPHGS OAS PAQAQE FPLEAGI SHLEADLS QTSLVLE TS 
IAEQLQELPFTPLHAPIWGTQTRSSAGQPSRASMVLLQSSGFP 
EILDANKQPAEAVSATEPVTFNPQKEESDCLQSNEMVLQFLAFS 
RVAQD CRGTS WPKTVYFTFQF YRFPPATT PRLQLVQLDE AGQ PS 
SGALTHI LVPVSRDGTFDAGS PGFQLR YM VGPGFLKPGERRC FA 
R YLAVQTLQI D VWDGDS LLL IGS AAVQMKHLLRQGR PAVQAS HB 








LEWATEYEQDNMWSGDMLGFGRVKPIGVHSWKGRLHLTLAN 
VGHPCEQ KVRGCSTLP PSRSRV I SNDGAS R FSGGS LLTTGSS RR 
KHWQAQKLADVDSBIiAAMIXTHARQGKGPQDVSRESDATRRRX 
LERMRSVRLQEAGGDI^RRGTSVLAQQSVRTQHLRDLQVIAAYR 
ERT KAES IAS LLSLAI TTEHTLHATLG VAEFFEFVLKNPHNTQH 
TVTVE I DNPELS VI VDS QEWRDFKGAAGLHT P VEEDMFHIjRGS L 
APQ LY LR PHE TAHVP F KPQS F3AG Q LAMVQAS PGLSNEKGMDAV 
SPWKSSAVPTKHAKVLFRASGGKPIAVLCLTVELQPHVVDQVFR 
FYHPELSFLKKAIRLPPWHTFPGAPVGMLGEDPPVHVRCSDPNV 
ICETQNVGPGEPRDIFLKVASGPSPElKDFFVI^YSDRWIiATPT 
QTWQ VYLH SLQRVDVS CVAGQLTRLSLVLRGTQTVRKVRAFTSH 
PQEL KTD P KGVF VLPP RG VQD I*HVGVRPLRAGS RFVHLNIjVDVD 
CHQLVAS WLVCLCCRQPL I S KAFE IMLAAGBGKGVNKRI TYTNP 
Y PSRRT FHLHSDHP EL LR FREDS PQVGGGETYTIGLQFAPS QRV 
GBBEILIYINDHEDKNBEAFCVKVIYQ 


7134 


2115 


1111 


GGEGFSYPPHVGLSLGTPLDPHYVLLEVHYDNPTYBEQLIDNSG" 
LRLFYTMDIRKYDAGVIEAGLWVSLFHTI PPGMPKFQSBOHCTL 
ECLEEALBABKPSGIHVFAVLLHAHLAGRGIRLRHFRKGKEMKL 
LAYDDDFDENFQEFQYLKEEQTILPGDISLITECRYNTKDRAEMT 
WGGLSTRSBMCLSYLLYYPRINLTRCASIPDIMBQLQFIGVKEI 
YRPVTTWPFIIKSPKQYKNLSF^AMNKFKJ^KKEGLSFNKLVL 
SLPVNVRCSKTDNAEWS IQGMTALPPDIERPYKAEPLVCGTSSS 
SSLHRDFS INLLVCLLLLSCTLSTKSL 


7135 


2 



2072 


FVPRVTPRSLSLQGPKGESVGSl^QPLPSSYLIFRAASESDGRC 
WIiDALELALRCSS LLRLGTCKPGRDGE PGTS PDAS PSSLCGLPA 
SATVHPDQDLFPLNGSSLENDAFSDKSERENPEESDTETQDHSR 
KTESGSDQSETPGAPVRRGTTYVEQVQEELGELGEASQVEIVSB 
ENKSLM WTLLKQLRPQMDLSRWLPTFVLBPRS FLNKLSDYYYH 
ADLLSRAAVEEDAYSRMKLVLRWYLSGFYKKPXGIKKPYNPILG 
ETFRCCWFHPQTDSRTFYIAEQVSHHPPVSAFHVSNRKDGFCrS 
GS ITAKS R FYGNSLSALLDG KATLTFLNRABDYTLTMP YAHCKG 
ILYGTMTLELGGKVTIECAKNNFQAQLEFKLKPFFGGSTS INQI 
SGKITSGEEVLASLSGHWDRDVFIKEEGSGSSALFWTPSGEVRR 
QRLRQHTVP LB EQTEIiESBRLWQHVTRAI S KGDQHRATQEKFAL 
EEAQRQRARERQE SLMPWKPQLFHLDP I TQE WHYRYEUHS PWD P 
LKDIAQTOQDGII»RTLQQEAVARQTTFLGSPGPRiIERSGPDQRL 
RXASDQPSGHSQATESSGSTPESCPELSDEEQDGDFVPGGESPC 
PRCRKEARRLQALHEAILS IRBAQQBLHRHLSAMLSSTARAAQA 
PTPGLLOS PRSWFLLCVFLACQLFINHILX 


7136 


2 


418 


DF VPS FRR P5GNTS QTVVn^LRAATLBI^ VAGLilE KIHHtiDDMLlfc 
S QQRKVRQMI EQLQNS KAV I QS KDATI QELKEKIA YLEAENLEM 
HDRMEHL I EKQISHGNFSTQARAXTENPGS IRI SKPPSPKPMPV 
IRWET 


7137 


2 


466


WASGMSTVPGGSRHS LGI QVRGG WG VTGGBEESLT V PVADT WQA 
GS FKVATQERNPQRAQMRLRRQKKG WP FLGDFLTBLQRLDSAI 
PDDLDGNTNKRS KEVRVIOEMQLLQVAAMNYRjLRPLEKFV^YFT 
RMEQLSDKESYKLSCQLEPENP 


7138 


2 


466 


WASGMSTVPGGSRHSLGIQVRGGWGVTGGEEESLTVPVADTWQA 
GS FKVATQERNPQRAQMRIiRRQ KKGWPFLGDFLTELQRLDS AI 
PDDLDGimjKRSKBVRVLQEMQLLQVAAMNYRLRPLEKFVTYFT 
RMEQLSDKESYKLSCQLEPENP 


7139 


1 


357 


SLRNSARGLKMAASAARGAAALRRSINQPVAFVRRIPWTAASSQ " 
LKEHFAQFGHVRRCILPFDKBTGFHRGLGWVQFSSEEGLRNALQ 
QENHI IDGVKVQVHTRRPKL PQTS DDE KKD F 


7140 


1401 


1957 


RASSLQVLKAWGGLIPSSFQQQHTGQYALEBI/FDLKVYDCFCSF 
NMNVSLEKQLRPSQPWPRGKCRKTPGWEEARPKAQDLRGDLGKT
QAGPAEAHTRGPPRLPAATGCPPHLPGLLSGISVD1DPTGLQSQ 

WTPKGQDPPLMPSEDyQKSLLBQYHI^LDQKLRKyvVGELIWNP 
ADFMTWftfYl 


7141 


124 


1073 


LDSRSCWI^MEDLBEDVRFIVDETLDFGGLSPSDSREESDITVL 
VTPBKPLRRGLSHRSDPNAVAPAPQGVRIjSLGPLS PEKLEE I LiD 
EANRLAAQLEQCAIiQDRESAGEGLGPRRVKPSPRRETFVLKDSP 
VRDLLPTVNSLTRSTPS /LKQPDASTPE * * *EGVSQGS PGYI WK 
EALQHEEGVTHLQSVPCIQKPSIFSS\SRSTPPVRGRAGPSGRA 
AASEBTRAAKLRGAAAKSS CQLP I PS AI PRPASRMPLTS RS VPP 

GRGALPPDSLSTRKGLPRPSTAGHRVRESGHKVPVSQRLNLPVM 
GATRSNLQPP 


7142 


65B 


839 


LI FLMLHMELKMLSS VTLHIRAFLYWICLKPTSCLIFQNVLNLL 
KK* SRAVGVWVMCRT/ YS SDLQVGVI KPWLLLGS QDAAHDLDT 
LKKNKVTHILNVAYG VENAFLSDFTYKS I S ILDLPETNILS YFP 
ECFEFIEEAKRKDGWLVHCNA 


7143 


3 


773 


SLEMSSDGEPLSRMDSEDSiSSTIMDVDSTISSGRSTPA>IMNGQ 
GSTTSSSKNIAYWCCWDQCQACFNSSPDLADHIRSIHVDGQRGG 
VFVCLWKGCKVYNTPSTSQSWLQRHMLTHSGDKPFKCVVGGCNA 
S FASQGGLARHVPTHFSQQNS S KVS SQP KAKEES PS KAGMNKRR 

KLKNKRRRSLARPHDFFDAQTLDAIRHRAICFNLSAHIESLGKG 
HSWFHSTVSIItLFFQIKYKTLQKNISTIISKSLKI 


7144 



1 


988 


FRVNMQDGGPS PAEHS KAEE SAGMEARFLG LPDAAGSS GPTPAR 
RCPAPR PAG VS YVIRDBVEKYNRNG VNALQLDPALNRL FTAGRD 
S 1 1 RI WS VNQHKQD P Y IASMEHHTDWVND I VLCCNG KTLI SASS 
DTTVKVWNAHKGFCMSTLRTHKDYVKALAYAXDKELVASAGLDR 
QIFLWDVNTLTALTAStTNTVTTSSLSGNKDS I YSLAMNQLGTI I 
VSGS TEKVLRVWDPRTCAKLMKL KGHTDNVKALIiLNRBGTQCLS 

GSSDGTIRLWSLGQQRCIATYRVHDEGVWALQVNDAFTHVYSGG 
RDRKI YCTDLRNPDIRVLI CE [ 



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WHAT IS CLAIMED IS: 

1. An isolated polynucleotide comprising a nucleotide sequence selected from the
group consisting of SEQ ID NO: 1-1786 and 3573-5358, a mature protein coding portion
of SEQ ID NO: 1-1786 and 3573-5358, an active domain of SEQ ID NO: 1-1786 and
3573-5358, and complementary sequences thereof.

2. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide hybridizes to the polynucleotide of claim 1 under stringent 
hybridization conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide has greater than about 90% sequence identity with the 
polynucleotide of claim 1. 

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1.

7. An expression vector comprising the polynucleotide of claim 1.

8. A host cell genetically engineered to comprise the polynucleotide of claim 1.

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 
operatively associated with a regulatory sequence that modulates expression of the 
polynucleotide in the host cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group
consisting of:

(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent
conditions with any one of SEQ ID NO: 1-1786 and 3573-5358.

11. A composition comprising the polypeptide of claim 10 and a carrier.

12. An antibody directed against the polypeptide of claim 10. 

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a
complex with the polynucleotide of claim 1 for a period sufficient to form the complex;
and

b) detecting the complex, so that if a complex is detected, the
polynucleotide of claim 1 is detected.

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising:

a) contacting the sample under stringent hybridization conditions
with nucleic acid primers that anneal to the polynucleotide of claim 1 under such
conditions;

b) amplifying a product comprising at least a portion of the
polynucleotide of claim 1; and

c) detecting said product and thereby the polynucleotide of claim 1 in
the sample.

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the 
method further comprises reverse transcribing an annealed RNA molecule into a cDNA 
polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a
complex with the polypeptide under conditions and for a period sufficient to form the
complex; and

b) detecting formation of the complex, so that if a complex formation 
is detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound 
complex is detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a 
cell, under conditions sufficient to form a polypeptide/compound complex, wherein the 
complex drives expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence 
expression, so that if the polypeptide/compound complex is detected, a compound that
binds to the polypeptide of claim 10 is identified. 

19. A method of producing the polypeptide of claim 10, comprising,

a) culturing a host cell comprising a polynucleotide sequence selected
from the group consisting of a polynucleotide sequence of SEQ ID NO: 1-1786 and 3573-5358,
a mature protein coding portion of SEQ ID NO: 1-1786 and 3573-5358, an active
domain of SEQ ID NO: 1-1786 and 3573-5358, complementary sequences thereof and a
polynucleotide sequence hybridizing under stringent conditions to SEQ ID NO: 1-1786
and 3573-5358, under conditions sufficient to express the polypeptide in said cell; and

b) isolating the polypeptide from the cell culture or cells of step (a). 
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20. An isolated polypeptide comprising an amino acid sequence selected from the 
group consisting of any one of the polypeptides SEQ ID NO: 1787-3572 and 5359-7144,
the mature protein portion thereof, or the active domain thereof. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide 
array. 

22. A collection of polynucleotides, wherein the collection comprising the sequence 
information of at least one of SEQ ID NO: 1-1786 and 3573-5358.

23. The collection of claim 22, wherein the collection is provided on a nucleic acid 
array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of 
the polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of 
the polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a computer- 
readable format. 

27. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20
and a pharmaceutically acceptable carrier.

28. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising an antibody that specifically 
binds to a polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier.
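
Claim 3 recites "greater than about 90% sequence identity", but the claims themselves do not state how identity is to be computed. The following is a minimal sketch of one common convention, assuming the two sequences have already been aligned to equal length with '-' marking gaps. The function name `percent_identity` and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: both strings are already aligned (same length, '-' for gaps).
    Columns where both sequences have a gap are ignored; every other column
    counts toward the denominator, so a gap opposite a residue is a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("alignment has no comparable columns")
    return 100.0 * matches / columns


if __name__ == "__main__":
    # Hypothetical toy alignment in which 9 of 10 columns match.
    print(percent_identity("ATGGCCATTG", "ATGGCCATTA"))
```

Other conventions (for example, dividing by the length of the shorter sequence) give different values, so a threshold such as 90% is only meaningful once the convention is stated.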
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Novel Vaccinia Vector Derived 
from the Host Range Restricted 
and Highly Attenuated MVA Strain
of Vaccinia Virus 

G. Sutter 1, B. Moss 2
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GSF-Centre for Environmental and Health Research, Oberschleissheim, Germany 

2 Laboratory of Viral Diseases,
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Introduction 

Variola virus, the causative agent of the human smallpox disease, was eradi- 
cated through vaccination with live vaccinia virus. The unique success of the 
immunisation campaign was due in great part to characteristics that facilitated 
field use of the vaccine. Its useful features included: i) high immunogenicity; ii) 
heat stability; iii) ease of administration; and iv) low production costs. The appearance
of molecular cloning techniques made it possible to recombine foreign genes
into the genome of vaccinia virus and to produce functional recombinant proteins 
with vaccinia vectors [1-3]. 

The former live vaccine against smallpox quickly became a promising vehicle 
for the construction of recombinant vaccines. However, enthusiasm about vaccinia 
virus as a recombinant live vaccine has been lessened by safety concerns. During 
the smallpox eradication programme, adverse reactions were observed after 
immunisation with vaccinia virus, and vaccination of immunocompromised 
patients or individuals with skin conditions such as eczema was contra-indicated. 
Moreover, despite its low transmissibility, vaccinia virus is infectious for humans 
and a wide range of animals and therefore raises the possibility of spread of virus 
to non-vaccinated individuals or to the general environment. 

In view of these safety issues, the ideal vaccinia virus vector should be attenu- 
ated and restricted in its host range to protect both the health of the vaccinated 
individual and the non-target environment; it should also be sufficiently immuno- 
genic to fulfil its purpose as a vaccine. Several highly attenuated vaccinia virus 
strains were developed for use as particularly safe smallpox vaccines. Because of 
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its extreme attenuation, the potential of the modified vaccinia Ankara (MVA) 
strain as a viral vector seemed worthy of study. 



Development and Phenotypic Characteristics
of the Modified Vaccinia Ankara 

MVA was derived from the vaccinia virus strain Ankara, which was used in
Turkey and Germany for vaccination against smallpox, by serial passage in chicken 
embryo fibroblasts. After over 500 passages, the resulting MVA virus differed by 
numerous biological markers from the known vaccinia virus strains [4], and it had 
lost its capacity to replicate productively in cells of human or other mammalian
origin.

When examined in various animal species from rodents to primates, MVA was 
found to be avirulent even in immunosuppressed animals. The absence of local 
reactions upon intradermal or subcutaneous injection of high virus doses markedly 
distinguished MVA from other vaccinia virus strains. More importantly, MVA was
used for the primary vaccination of over 120,000 humans against smallpox [5]. The
vaccine was administered intracutaneously, subcutaneously or intramuscularly
and, in humans as in other animals, no pock lesions developed at the site of inoculation.
During these extensive clinical trials, which included, for example, high-risk
individuals with skin lesions, no adverse reactions were associated with the use of
MVA vaccine.

Recent comparisons of the genomes of MVA and the ancestral vaccinia 
Ankara strain identified six major deletions in the MVA genome. The loss of 
genetic information totalled over 30,000 base pairs of DNA, including at least two 
vaccinia virus host range genes, which resulted in the extreme restriction of host 
range of MVA [6].

Non-replicating MVA Vector Efficiently Expresses 
Recombinant Genes 

As MVA cannot grow productively in human and most other mammalian cells, 
its usefulness as an expression vector was evaluated. High expression of recombi- 
nant genes seemed unlikely, given reports indicating that most other host-range 
vaccinia virus mutants are inhibited very early in infection. However, when human
cells were infected with MVA, viral replication was blocked late in infection, thus
preventing the assembly of mature infectious virions (Fig. 1). An important consequence
of this late-stage defect is that MVA is able to express viral and recombinant
genes at high levels even in non-permissive cells.

Because using MVA to produce recombinant proteins in mammalian tissue culture
should drastically reduce risks of infection for laboratory workers, MVA was
proposed as an exceptionally safe and efficient expression vector [7]. Plasmid
transfer vectors that provide flanking DNA sequences for homologous recombination
with the MVA genome were constructed. Foreign genes, under the control of
strong synthetic vaccinia virus early/late promoters, are precisely targeted to the
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site of a naturally existing deletion in the MVA genome; transient or stable co- 
expression of a marker gene for antibiotic selection facilitates the isolation of 
recombinant viruses (Fig. 2). The overall strategy in designing the vector plasmids 
was to avoid unnecessary changes in the genotype and phenotype of resulting 
recombinant MVA viruses. 
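
The insertion strategy described above (flanking sequences directing a foreign gene cassette to an existing deletion site) can be pictured with a toy string model. This is only an illustrative sketch: the sequences, the function name `insert_by_flanks`, and the treatment of recombination as exact string matching on two adjacent flanks are all assumptions made for illustration, not the authors' plasmid design or a model of recombination itself.

```python
def insert_by_flanks(genome: str, left_flank: str, right_flank: str, cassette: str) -> str:
    """Toy double crossover: place `cassette` between two adjacent flanks.

    Assumption: the two flanks occur exactly once, directly next to each
    other, at the target site (standing in here for a natural deletion site).
    """
    site = left_flank + right_flank
    if genome.count(site) != 1:
        raise ValueError("flanks must mark exactly one adjacent site in the genome")
    # Everything outside the flanks is left untouched, mirroring the stated
    # goal of avoiding unnecessary changes to the rest of the genome.
    return genome.replace(site, left_flank + cassette + right_flank)


if __name__ == "__main__":
    # Hypothetical sequences; lower case marks the inserted cassette.
    genome = "TTGACCGGATATCCGTTAAC"
    promoter, foreign_gene, marker = "aaaa", "atgccctga", "atgggttaa"
    recombinant = insert_by_flanks(
        genome, "CCGGAT", "ATCCGT", promoter + foreign_gene + marker
    )
    print(recombinant)
```

In this picture the flanks fix where the cassette lands, while the promoter and marker travel with the foreign gene as a single unit.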




Fig. 1: Virion morphogenesis in human cells



A: Exclusively immature particles found in MVA-infected HeLa cells. 

B: Mature brick-shaped particles with complex internal structures in HeLa cells infected with 
wild-type vaccinia virus. 
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Fig. 2: Schematic map of the genome of MVA and expression plasmid designed for the insertion of 
foreign genes by homologous recombination 



The plasmid pm gptex HA-NP was used to construct MVA recombinants expressing influenza 
virus haemagglutinin (HA) and nucleoprotein (NP) genes.
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Evaluation as Recombinant Vaccine 

The immunogenicity of recombinant gene products delivered by viral vectors
is particularly important for successful vaccine applications. To ascertain whether
MVA vectors would be useful for immunisation experiments with recombinant
proteins, a recombinant MVA was constructed and tested that simultaneously
expresses the haemagglutinin (HA) and nucleoprotein (NP) genes of influenza
virus (Fig. 2) [8]. The HA and NP genes were chosen because of the extensive
immunological information available about the proteins they encode. The haemagglutinin,
most importantly, stimulates virus-neutralising antibodies, whereas
the nucleoprotein is a predominant target for cytotoxic T-cell response. In addition,
a well-established influenza virus mouse challenge model made it possible to
obtain an accurate evaluation of the recombinant MVA HA-NP in protection
studies.

When MVA HA-NP was used to infect tissue cultures, high amounts of recombinant
HA and NP were made in permissive chicken embryo fibroblasts, as well as
in non-permissive human and mouse cell lines. Mouse cell monolayers infected
with low doses of MVA HA-NP showed no cytopathic effect, and immunostaining
with antibody to HA revealed that only individual cells produced the recombinant
protein (Fig. 3). By contrast, a massive cytopathic effect occurred when mouse
cells were infected with a replication-competent vaccinia virus, Western Reserve
recombinant (WR HA-NP).












Fig. 3: Immunostaining of mouse cell monolayers infected for two days 



A: Infected with MVA HA-NP 

B: Infected with WR HA-NP

The arrows indicate three of the multiple single cells synthesizing haemagglutinin after infection
with MVA HA-NP.



The recombinant MVA was found to be immunogenic, since mice inoculated
by various routes produced high levels of circulating antibodies that inhibited
haemagglutination by influenza virus. Vaccination with MVA HA-NP also primed
for a strong cytotoxic T-cell response directed to the influenza virus proteins.
Animals immunised once with relatively low doses of recombinant MVA were
completely protected against lethal respiratory tract challenge with influenza virus.
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Significantly, the vaccine doses of MVA HA-NP needed to accomplish a fully pro- 
tective humoral and/or cellular immune response were similar to those required 
for the replication-competent WR HA-NP (Table 1). 



Table 1: Protection of mice inoculated with MVA HA-NP from lethal challenge with influenza
virus A/PR/8
TCID50/animal inoculated by the intramuscular route.

Surviving animals/total animals challenged in each group: 100 LD50 influenza A/PR/8
challenge delivered to 10-week-old mice, 4 weeks post vaccination.



These studies have shown that the MVA strain of vaccinia virus is a highly 
attenuated and non-replicating vaccinia virus vector. The features of MVA that 
make it so attractive as a vector system include: the high titres achieved in chicken 
embryo fibroblasts; the extremely restricted host range; the avirulence in animals 
even under immunosuppressive conditions; the extensive safety testing in humans; 
the late-stage block in non-permissive cells, allowing high-level recombinant gene 
expression; and high immunogenicity as a recombinant vaccine. One might specu- 
late whether the surprising immunogenicity of recombinant antigens delivered by 
highly attenuated non-replicating poxvirus vectors is related to the absence of 
virus particle formation or to the loss of viral factors that counteract the function 
of the immune system. Whatever the cause, the adoption of MVA or other non- 
replicating poxvirus vectors for the development of live recombinant vaccines 
should greatly reduce the potential hazard to individuals and prevent transmission 
to non-target species. 
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Highly Attenuated Poxvirus Vectors: 
NYVAC, ALVAC and TROVAC

2 Pasteur Mérieux Sérums et Vaccins, Marcy l'Etoile, France

Introduction 

More than a decade ago, the concept of recombinant poxviruses was intro- 
duced to the scientific and medical community. It was clear from the beginning 
that recombinant poxviruses would provide a powerful vaccination strategy for 
heterologous pathogens [1]. It was also clear that before these vector systems 
would be accepted for general use, safety concerns would have to be addressed. 
Significant effort expended in both basic research and preclinical development to 
address the safety issues attendant on the use of poxvirus-based vectors has met
with good success. Today, three highly attenuated yet highly efficacious poxvirus
vectors are available for use in both veterinary and human medicine [2, 3]. The
three vectors, NYVAC, ALVAC and TROVAC, are briefly described in the fol- 
lowing report. 

The NYVAC Vector 

Analysis of existing vaccinia vaccine strains, with an eye to the balance between 
immunogenicity and attenuated phenotype, resulted in the selection of the Copenhagen
strain for further development. A plaque-cloned isolate provided the starting
material. The entire DNA sequence was derived [4]. This provided a well-defined
genome for future genotypic alterations. Furthermore, since the genetic organisation
was known, genetic functions associated with virulence, tissue tropism and host
range could be targeted. Relevant genes could be precisely deleted from the vector, 
thus providing a highly attenuated phenotype. The complete deletion of the open 
reading frames specifying these adverse properties would preclude subsequent 
reversion to wild-type. Eighteen open reading frames were targeted and precisely 
deleted from the Copenhagen parent to generate the NYVAC vector [5]. 

To assess the attenuation phenotype of the NYVAC vector, studies were per- 
formed in various animal models. Comparison with other orthopoxviruses, includ- 
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ing the WR strain of vaccinia virus, the parental plaque-cloned isolate of the 
Copenhagen strain, and the Wyeth vaccine strain, provided rigorous criteria for 
assessing the attenuation characteristics of the NYVAC vector. Comparison with 
the Wyeth vaccine strain is particularly important, since this was the major strain 
used in the United States during the smallpox eradication programme and was 
considered by many to have demonstrated the best safety profile. 

The resultant NYVAC strain was shown to be highly attenuated according to 
the following criteria: 

- little or no induration upon intradermal inoculation on the shaved rabbit skin; 

- rapid loss of infectious virus from the inoculation site; 

- absence of tropism for ovaries or testes; 

- little or no recovery of infectious virus from lung, spleen and liver; 

- lethality by intracranial inoculation into newborn mice reduced a million times 
by comparison with parental or Wyeth vaccine strain; 

- no evidence of disseminated vaccinial infection in the immunocompromised 
host (either homozygous nude or cyclophosphamide-suppressed mice); 

- reduced replication competency in cells derived from human and other species 
[5; unpublished observations]. 

Nevertheless, despite this highly attenuated profile the NYVAC vector retains
an immunogenic potential equivalent to that of the TK- Copenhagen parent [5;
unpublished observations].

Significant attention has been given to the consequences of vaccination in the 
immunocompromised or immunodeficient host [5; unpublished observations].
Some of these results are presented in Tables 1 and 2. 


Table 1: Lack of disseminated infection of NYVAC in cyclophosphamide-treated mice 



- Intraperitoneal inoculation of WR strain [2.15 x 10^2 to 2.15 x 10^4 plaque-forming units
(pfu)] resulted in lesions at distant sites between 12 and 16 days post-inoculation.

- Intraperitoneal inoculation of Wyeth vaccine strain (9.5 x 10^4 pfu) resulted in lesions
at distant sites between 7 and 15 days post-inoculation.

- Intraperitoneal inoculation of Copenhagen parent (1.65 x 10^5 to 1.65 x 10^9) resulted in
lesions at distant sites between 7 and 12 days post-inoculation.

- Intraperitoneal inoculation of NYVAC did not result in lesions at distant sites over the
100-day observation period even at the highest doses.



Recently, in recognition of the biological properties and highly attenuated phenotypic
profile, the Recombinant DNA Advisory Committee of the National Institutes
of Health (NIH) has reduced the biological safety level from BSL2 to BSL1
[6]. This is the lowest level for biosafety containment. The NYVAC vector is the 
only orthopoxvirus to be accorded such a classification. 



Table 2: Lack of disseminated infection of NYVAC in homozygous nude mice 



- Intraperitoneal inoculation of WR strain resulted in lesions at distant sites (10^3 and 10^4
pfu on days 17 and 34 post-inoculation, respectively).

- Intraperitoneal inoculation of Wyeth vaccine strain (5 x 10^7 or 5 x 10^8 pfu) resulted in
lesions at distant sites, orchitis, and death.

- Intraperitoneal inoculation of Copenhagen parent (10^4 to 10^9 pfu) resulted in lesions
at distant sites (between 11 and 21 days post-inoculation) and orchitis (between 23 and
30 days post-infection).

- Intraperitoneal inoculation with NYVAC resulted in no signs of disease throughout
the 100-day observation period, even at the highest doses.



The ALVAC Vector 

In contrast to vaccinia and other members of the Orthopoxvirus genus, which 
have a very broad vertebrate host range, the viruses included within the Avipoxvi- 
rus genus are restricted for productive replication to avian species [7]. Advantage 
has been taken of this natural attenuation characteristic in establishing avipoxvi- 
ruses as useful vectors in non-avian species [2]. 

Avipox recombinants expressing genes from mammalian pathogens have been
constructed. Inoculation of these avipox vectors into tissue culture cells of 
non-avian origin results in the expression of the foreign gene product at levels and 
processing proficiency similar to those of vaccinia virus recombinants. In contrast 
to vaccinia recombinants, the replication cycle of the avipox virus vectors in mam- 
malian cells is abortive, and no progeny virus is formed. In cells of human origin, 
the avipox replication cycle is abortive in the early phase, before viral DNA replication.
Attempts to adapt the avipoxvirus vectors for replication on mammalian
cells have been unsuccessful [5; unpublished observations]. In spite of the abortive
replication of avipoxvirus vectors in mammalian cells, inoculation of these recombinants
into animals demonstrated that the expression of the foreign immunogen
was sufficient to induce a protective immune response [8].

Comparative studies with fowlpox and canarypox vectors expressing the 
rabies glycoprotein demonstrated the second to be much more efficacious [8]. 
Hence, the canarypox vector was chosen for further development. ALVAC is the 
name given to a plaque-cloned isolate of a highly attenuated strain of canarypox, 
Kanapox [5]. In France, Kanapox is a licensed veterinary vaccine used extensively
by breeders of canaries. Inoculation of the ALVAC vector intracranially into
newborn mice confirmed its highly attenuated phenotype (10 million times more
attenuated than the Wyeth vaccine strain), consistent with the inability of the
virus to grow in mammalian cells. Inoculation of ALVAC into immunocompromised
or immunodeficient laboratory animals did not indicate any significant virulence
[5]. Inoculation of the ALVAC vector into susceptible canary birds demonstrated
measurable replication [our unpublished observations]. In contrast,
intradermal inoculation into mice demonstrated clearance of the inoculum virus
without any evidence of viral replication. Inoculation of birds did not result in
spread of the vector to contacts, as measured by seroconversion or subsequent
protection of contacts to pathogen challenge [our unpublished observations]. The
ALVAC vector has been shown to be safe and efficacious in many species, including
cats, dogs, and horses [8-10; unpublished observations]. ALVAC recombinants
have also been shown to be safe and well tolerated when administered to
human volunteers, including HIV seropositives [11-13].

Because of its biological properties and attenuation phenotype, the ALVAC
vector has a biosafety level 1 (BSL1) classification in the United States [6]. 
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The TROVAC Vector 

As with the canarypox ALVAC vector, the fowlpox-based vector has been
developed for use in mammalian species. However, since the TROVAC vector is
based on a highly attenuated fowlpox vaccine strain virus useful for immunising
day-old chicks against fowlpox disease, an obvious application is the construction of
recombinant vaccines for use in the poultry industry. In this regard, fowlpox-based
recombinants have been shown to be very safe and effective for vaccinating day-old
chicks. Effective vaccination using recombinant fowlpox-based vectors against challenge
with avian influenza virus and Newcastle disease virus has been reported
[9, 14]. In day-old birds, vaccination with these fowlpox-based recombinants is well
tolerated. Induction of immune response is documented, and protection against virulent
challenge is well established. Significantly, no measurable shedding of the vaccine
virus to unvaccinated contact controls was observed, since no seroconversion
and no protection against challenge virus was observed. Historically, fowlpox vaccines
have been administered openly to large numbers of poultry flocks, for example
via drinking water [15]. The scientific literature records no adverse effects
involving either environmental contamination or infection of poultry handlers [16].

Because of the highly attenuated phenotype of the TROVAC vector and the 
significant accumulation of preclinical safety data, the biosafety level containment 
for the TROVAC vector is now established at BSL1 [6]. 
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Summary 



Three highly attenuated and efficacious poxvirus-based vectors, NYVAC, 
ALVAC and TROVAC, are available for targeted applications as recombinant 
vaccines in both human and veterinary medicine. The attenuated phenotype of the
three vectors is consistent with safe use for vaccination purposes, for the vaccinee,
for unvaccinated contacts, and for introduction into the environment. The precise 
deletion of virulence and host range genes in the NYVAC vector precludes rever- 
sion to the virulent phenotype by back mutation. Dissemination of recombinant 
vaccines based on the NYVAC, ALVAC and TROVAC vectors is highly dimin- 
ished, because of the genetic engineering in NYVAC and the natural attenuated 
phenotype of ALVAC and TROVAC. Studies have demonstrated that these 
recombinant vectors are genetically and phenotypically stable after serial passage 
in vitro as well as in vivo. 

NYVAC, ALVAC and TROVAC vectors are the only three poxvirus-based 
vectors that are classified as BSL1 agents. 
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Replicating and Host-Restricted 
Non-Replicating Vaccinia Virus Vectors 
for Vaccine Development 

B. Moss

Laboratory of Viral Diseases, National Institute of Allergy and Infectious Diseases, 
National Institutes of Health, Bethesda MD, USA 

Key words: Poxvirus, vaccinia virus, expression vector. 

Abstract Vaccinia virus, effectively used for immunization against smallpox, now serves as a 
recombinant vector for the development of vaccines against other infectious agents and cancer. 
Recent advances have led to increased gene expression, enhanced immunogenicity, and greater 
safety. Both replicating and host-restricted non-replicating vectors are available. 

Introduction 

Vaccinia virus is the best characterized member of the poxvirus family. 
Although the origin of vaccinia virus is still uncertain, its role in the prevention of 
smallpox is well documented [1]. The success of vaccination can be attributed to:
the close immunological relationship between vaccinia virus and variola virus, the 
causative agent of smallpox; the long duration of immunity; ease of vaccine pro- 
duction; the heat stability of the dried vaccine; and simple and effective methods of 
administration. These features, together with the absence of animal reservoirs of 
variola virus and the distinctive characteristics of the disease, contributed to the 
eradication of smallpox. Nevertheless, the usefulness of vaccinia virus has not 
ended. Recombinant forms of vaccinia virus, that express heterologous genetic 
material, have been developed for immunization against other diseases [2, 3]. 

An understanding of the molecular biology of vaccinia virus is crucial for the 
optimal development of vectors. Like all poxviruses, vaccinia virus has a large 
double-stranded DNA genome that is packaged, along with enzymes used for 
mRNA biosynthesis, in a complex lipoprotein-enveloped particle [4]. Vaccinia 
virus is capable of productively infecting cells from many different mammalian and 
avian sources. Replication takes place entirely within the cytoplasm and the viral 
genes are expressed in a precisely programmed fashion [5]. There are three classes 
of viral genes - early, intermediate and late - with each class having distinctive 
promoter sequences and cognate transcription factors. Early mRNAs are synthesized
from the template within the infecting virion, whereas newly synthesized
viral enzymes and DNA are needed for synthesis of intermediate and late
mRNAs.



Insertion of foreign DNA 

Poxvirus DNA, by itself, is not infectious since viral enzymes are needed to [...]

[...] DNA that can be added to the vaccinia virus genome by this procedure exceeds
[...]. In an alternative method, the viral genomic DNA is cut at a
[...] then transfected with the ligated DNA. This procedure is
useful for introducing large DNA fragments and for avoiding intermediate cloning
in bacteria [6].

Poxvirus promoters

Since expression of a foreign gene is dependent on [...]
[...] and late [11] promoters. The highest levels of recombinant
protein have been obtained with strong natural [12, 13] or synthetic
promoters. The latter can result in over a 40-fold increase in expression relative to
that of the widely used P7.5 promoter. Nevertheless, as discussed below, it may be
advantageous to use early or tandem early-late promoters for induction of
cell-mediated immune responses.

Foreign genes 

Continuous open reading frames, derived either from foreign genes without
introns, DNA copies of mRNAs, or by in vitro synthesis, are required as vaccinia
virus mRNAs are not spliced [15]. A TTTTTNT sequence
at the end of early poxvirus genes signals transcriptional termination downstream
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[16]. Such sequences, when present within the body of foreign genes to be regula- 
ted by early promoters, may diminish expression and should be altered without 
changing coding properties [17]. 
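
The check described above, looking for the early termination signal inside a foreign gene before placing it under an early promoter, can be sketched in a few lines of Python. This is only an illustrative sketch: the function name, the example sequence, and the choice to report overlapping matches are assumptions made here, not part of the cited protocols. It scans the coding strand for TTTTTNT (five T residues, any base, then T) and reports where each occurrence starts, so that those codons can be considered for synonymous changes.

```python
import re

# Early-gene termination motif named in the text: TTTTTNT
# (five T's, any base, then T). The zero-width lookahead lets
# overlapping occurrences be reported as separate hits.
TERMINATION_MOTIF = re.compile(r"(?=T{5}[ACGT]T)")


def find_early_termination_signals(coding_strand: str) -> list[int]:
    """Return 0-based start positions of TTTTTNT motifs in a DNA sequence."""
    sequence = coding_strand.upper()
    return [match.start() for match in TERMINATION_MOTIF.finditer(sequence)]


if __name__ == "__main__":
    # Hypothetical open reading frame used only for illustration.
    example_orf = "ATGGCTTTTTATGGATAA"
    print(find_early_termination_signals(example_orf))
```

Each reported position marks a site where a silent base change would remove the motif; choosing the replacement codon is left to the user, since it depends on the reading frame and the amino acids encoded.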

Selection of recombinant viruses 

Under the usual transfection conditions, approximately one of a thousand pro- 
geny virus particles is a recombinant. The relatively high recombination frequency 
makes it possible to screen virus plaques by DNA hybridization or expression of 
gene products [18]. Selection or general screening procedures, however, can speed 
up the process. These include: selection for the thymidine kinase (TK) negative 
phenotype in TK deficient cells [19], antibiotic selection [20-23] and screening for 
β-galactosidase [24, 25], host range [26] or plaque size [27]. Detailed laboratory
protocols for making recombinant viruses and a list of convenient transfer vectors 
have been published [28, 29]. 

Immunogenicity 

Recombinant vaccinia viruses induce both humoral [30-32] and cellular [33-35] 
immune responses. Since the proteins undergo normal post-translational modifications
and intracellular trafficking, they are presented in their native configuration.
This is an important feature, as viral neutralizing antibodies are frequently direct- 
ed to conformational epitopes of surface glycoproteins. In some cases, modifying 
secreted or intracellular proteins so that they were presented in the plasma mem- 
brane, greatly increased the antibody response [36-39]. In numerous examples, 
protection induced by recombinant vaccinia viruses was correlated with neutrali- 
zing antibody against viral envelope proteins [40, 41]. 

The induction of a strong class I restricted cytotoxic T cell (CTL) response provides
a major advantage of infectious recombinant viruses compared to inactivated
or subunit vaccines [33, 34]. In some animal models the CTL response provided
protection against a virus challenge [42-45].

Antigen presentation, by class I molecules, may be decreased late in infec- 
tion [46, 47]; consequently, gene regulation by early (or tandem early/late) pro- 
moters is recommended. Consistent with present concepts of presentation, 
expression of minigenes encoding short peptides is sufficient to induce CTL 
responses [48-51]. 

Site of inoculation 

The type of immune response may depend on the site of inoculation. Thus, bet- 
ter protection against upper respiratory infections occurred after nasal inocula- 
tions with recombinant vaccinia viruses that express influenza and respiratory syn- 
cytial virus envelope glycoproteins compared to intradermal inoculations, 
although both routes protected against lower respiratory infection [52, 53]. Mucosal
immunity may also be induced by intestinal inoculation with recombinant vaccinia
virus [54].






Attenuated vaccinia viruses 

The significant adverse [...] [1, 55] has prompted the development of highly
attenuated [...] recombinant vaccines. Four general [...]: (1) adaptation of
existing vaccinia viruses that were attenuated by [...] in tissue culture, [...] (3)
[...] (4) [...]

[...] vaccine strains [56-59]. During more than 500 passages in chick embryo
fibroblasts, MVA became host restricted and unable [...] mammalian cells. MVA
is avirulent in [...] animals and was shown to have no [...] of 120,000 humans,
many of whom were at high risk for the [...] smallpox vaccine strains.
Genetic analyses indicated that more than 30,000 bp [...], including at
least two host range genes, had been deleted in MVA [...]. [...] replication
of MVA in [...] assembly [61], rather than at an early stage as occurs with other
host-restricted poxviruses. Consequently, even during nonproductive infections, MVA
vectors produce recombinant proteins in amounts that are similar to [...] of wild
type [...] in mice is [...] that provided protective immunity in test animals [64].

[...] altered by specific gene deletions [...] such as TK [65], [...] protein [68],
ribonucleotide reductase [...], [...] dehydrogenase [73], complement control
protein [...], [...] host range genes [75], [...] and multiple [...]
appears to be particularly immunogenic [78, 79].

The insertion of lymphokine genes into [...] offers still another method of [...]
without affecting immunogenicity. Recombinant vaccinia viruses [...] or human [67]
IL-2 or interferon-γ [81, 82] were much less pathogenic than wild-type virus for
[...] expressing virus [...] virus [85, 86].
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Future developments 



Recombinant vaccinia viruses have proven their value in determining the tar- 
gets and types of immune responses needed for protection against infectious 
agents in experimental animals. The licensing of some recombinant vaccinia virus 
vaccines for veterinary use seems likely. Of special note is a vaccinia virus vector 
that expresses the glycoprotein gene of rabies virus for wild-life vaccination [87-89] 
and a recombinant vaccinia virus that expresses the envelope glycoprotein genes of 
rinderpest, an endemic disease of cattle in Africa and India [90]. The safety and 
immunogenicity of a first generation recombinant vaccinia virus HIV vaccine has 
been demonstrated in phase 1 human testing [91, 92]. The latter vaccine might be
improved by increasing gene expression, using an envelope gene from a more pre- 
valent isolate, and a more attenuated vector. 
UNIVERSITY OF ROCHESTER v. G.D. SEARLE & CO.:
WRITING



I. Introduction 



The first paragraph of 35 U.S.C. § 112 is currently interpreted to include a 
written description requirement that is distinct from the requirement of enabling a 
person skilled in the art to make and use an invention. 3 Opinions differ as to
whether a distinction between these requirements has always existed or whether it is 
a recent development. The Patent Acts of 1790 and 1793 required that the patent
specification distinguish the invention from other known things and enable one
skilled in the art to practice the invention. With the Patent Acts of 1836 and 1870, 
enablement became the sole measure of satisfying the description requirement. 
From the early nineteenth century, courts relied on knowledge of one skilled in the 
art to identify a "principle" or "mode of operation" to distinguish an invention and to 
establish the limit of an inventor's exclusive right over alleged infringers. During 
this time, the understanding of one skilled in the art as applied to patentability and 
infringement was closely linked to "interchangeability" and "equivalence" in view of 
the written description of a patent specification. Subsequent to enactment of the 
Patent Act of 1952, courts began to distinguish between a statutory requirement of a 
description of the invention and a statutory requirement of enablement by one skilled 
in the art to make and use it, the former necessitating sufficient demonstration to 
one skilled in the art that the inventor "possessed" or "invented" the invention. 
Analysis of a specification under the "written description requirement" relied on a 
determination of whether one skilled in the art would comprehend the scope of the 
claimed invention in view of the description provided, in effect employing enablement 
as the statutory threshold for description purposes. The Court of Appeals for the 
Federal Circuit ("CAFC") in University of Rochester v. G.D. Searle & Co. [hereinafter
Rochester II] departs from considerations of enablement to assess adequacy of the
written description in a patent specification. Further, opinions issued in a decision
denying a petition to rehear this case portend a fundamental split in authority in the
CAFC that, depending on the panel, will severely limit or deny patent protection






II. Summary of University of Rochester:

NO PROTECTION FOR THE "PHILOSOPHER'S STONE"






3 See In re Barker, 559 F.2d 588, 593 (C.C.P.A. 1977) ("35 USC 112, first paragraph, contains
separate requirements for a written description [1] of the invention, and [2] of the manner and
process of making and using it, in such full, clear, concise, and exact terms as to enable any person
skilled in the art . . . to make and use the same . . . ."); see also In re Gardner, 475 F.2d 1389, 1391
(C.C.P.A. 1973) (During the prosecution of the patent, the patent examiner determines "whether the
separate but related description and how-to-use requirements of the first paragraph of 35 USC 112
have been satisfied.").
The CAFC in Rochester II 4 affirmed a decision by the United States District
Court for the Western District of New York 5 invalidating U.S. Patent No. 6,048,850
[hereinafter the '850 patent] for failure to provide a written description of the
claimed invention as required by the first paragraph of section 112 of United
States Code Title 35. 6 The claims were directed to a method for selectively inhibiting
mammalian prostaglandin H synthase-2 (PGHS-2, or COX-2) in a human host by
administering a non-steroidal compound that selectively inhibits activity of a
PGHS-2 gene product. 7 The invention was based on a discovery by scientists at the
University of Rochester of the existence of PGHS-2 and its association with
inflammatory stimuli responsible for pain and inflammation, and that PGHS-2 has
functions distinct from PGHS-1, which provides benefits such as assistance in
protecting the stomach lining. 8 Known pain relievers employed to inhibit activity of
PGHS-2 also inhibited PGHS-1, potentially causing stomach irritation. Having
identified the distinct functions between PGHS-1 and PGHS-2, the scientists






4 Univ. of Rochester v. G.D. Searle & Co., 358 F.3d 916 (Fed. Cir. 2004), cert. denied, 125
S. Ct. 629 (Nov. 29, 2004).

5 Univ. of Rochester v. G.D. Searle & Co., 249 F. Supp. 2d 216, 236 (W.D.N.Y. 2003), aff'd, 358
F.3d 916 (Fed. Cir. 2004) [hereinafter Rochester I].

6 35 U.S.C. § 112 (2000). The pertinent section states:

The specification shall contain a written description of the invention, and of the
manner and process of making and using it, in such full, clear, concise, and exact
terms as to enable any person skilled in the art to which it pertains, or with which
it is most nearly connected, to make and use the same, and shall set forth the best
mode contemplated by the inventor of carrying out his invention.

Id.

7 The independent claims 1, 5 and 6 of the '850 patent are as follows:

1. A method for selectively inhibiting PGHS-2 activity in a human host,
comprising administering a non-steroidal compound that selectively inhibits
activity of the PGHS-2 gene product to a human host in need of such treatment.

5. A method for selectively inhibiting PGHS-2 activity in a human host,
comprising administering a non-steroidal compound that selectively inhibits
activity of the PGHS-2 gene product in a human host in need of such treatment,
wherein the activity of the non-steroidal compound does not result in significant
toxic side effects in the human host.

6. A method for selectively inhibiting PGHS-2 activity in a human host,
comprising administering a non-steroidal compound that selectively inhibits
activity of the PGHS-2 gene product in a human host in need of such treatment,
wherein the ability of the non-steroidal compound to selectively inhibit the
activity of the PGHS-2 gene product is determined by:

a) contacting a genetically engineered cell that expresses human PGHS-2, and not
human PGHS-1, with the compound for 30 minutes, and exposing the cell to a
pre-determined amount of arachidonic acid;

b) contacting a genetically engineered cell that expresses human PGHS-1, and not
human PGHS-2, with the compound for 30 minutes, and exposing the cell to a
pre-determined amount of arachidonic acid;

c) measuring the conversion of arachidonic acid to its prostaglandin metabolite;
and

d) comparing the amount of the converted arachidonic acid converted by each cell exposed
to the compound to the amount of the arachidonic acid converted by control cells that were
not exposed to the compound, so that the compounds that inhibit PGHS-2 and not PGHS-1
activity are identified.

U.S. Patent No. 6,048,850 (issued April 11, 2000).

8 Rochester I, 249 F. Supp. 2d at 219.
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developed an assay for identifying a non-steroidal compound that would selectively 
inhibit PGHS-2. 9 The '850 patent described that assay, but did not include any 
compounds identified by its use. 10 
The district court stated that:

[T]he real issue [...] method of treatment is adequate where a compound that is
necessary to practice that method is described only in terms of its function, and
where the only means provided for finding such a compound is essentially a
trial-and-error process. 11

Comparing the non-steroidal compound of the '850 patent claims to the
"philosopher's stone," the court found that, without the compound, the patentees
could not have possession of the claimed method of its use. 12 The court stated that, to
be an inventor or patentee, one must demonstrate possession of the invention, 13 and
that failure to provide such a description resulted in a "wish or
plan" 14 which was "an attempt to preempt the future before it has arrived." 15 The
district court held that the written description of the specification, as a matter of law,
failed to comply with the written description requirement of 35 U.S.C. § 112 and
granted defendants' motion for summary judgment because no compound was
identified by the assay. 16 The court also granted a motion for summary judgment of
9 Id.

10 Id. at 224.
11 Id. at 221.

12 Id. at 230. The court noted:

In effect, then, the '850 patent claims a method that cannot be practiced until one
discovers a compound that was not in possession of, or known to, the inventors
themselves. Putting the claimed method into practice awaited someone actually
discovering a necessary component of the invention. In some ways, this is
reminiscent of the search for the so-called "philosopher's stone," eagerly sought
after by medieval alchemists, which supposedly would transmute lead into gold.
While the Court does not mean to suggest that the inventors' significant work in
this field is on a par with alchemy, the fact remains that without the compound
called for in the patent, the inventors could no more be said to have possessed the
complete invention claimed by the '850 patent than the alchemists possessed a
method of turning base metals into gold.
Id. at 229-30.

13 Id. at 218. The court stated that:

An inventor or patentee is entitled to a patent to protect his work but only if he
produces or has possession of something truly new and novel. The "invention" he
claims must be sufficiently concrete so that it can be described for the world to
appreciate the specific nature of the work that sets it apart from what was before.
The inventor must be able to describe the item to be patented with such clarity
that the reader is assured that the inventor actually has possession and
knowledge of the unique composition that makes it worthy of patent protection.

Id.

14 Id. ("The patent here does not do that. What the reader learns from this patent is a wish or
plan or first step for obtaining a desired result.").

15 Id. ("Claiming all DNA's that achieve a result without defining what means will do so is not
in compliance with the description requirement; it is an attempt to preempt the future before it has
arrived.") (quoting Fiers v. Revel, 984 F.2d 1164, 1171 (Fed. Cir. 1993)).

16 Id. at 224.






patent invalidity for non-enablement on the basis that a person of ordinary skill in
the art would have to engage in undue experimentation, without assurance of
success, in order to identify a compound that would selectively inhibit PGHS-2 gene
product activity, as claimed. 17

On appeal, the CAFC affirmed the district court's decision granting summary
judgment invalidating the '850 patent for failure to meet the written description
requirement. The analysis by the CAFC [...] began by partitioning
the first paragraph [...]:

Three separate requirements are contained in that provision: (1) "the
specification shall contain a written description of the invention"; (2) "the
specification shall contain a written description ... of the manner and
process of making and using it [i.e., the invention] in such full, clear, concise
and exact terms as to enable any person skilled in the art to which it
pertains, or with which it is most nearly connected, to make and use the
same"; and (3) "the specification . . . shall set forth the best mode
contemplated by the inventor of carrying out his invention." 18






The CAFC relied on Evans v. Eaton 19 to aver that the Supreme Court has
recognized the existence of separate written description and enablement
requirements, providing "two objects" for a patent specification, since at least 1822. 20

Applying these principles to the case at bar, I conclude that, as a matter of law, 
the '850 patent does not comply with the written description requirement of § 112,
and that defendants are therefore entitled to summary judgment on that issue. 
The patent does no more than describe the desired function of the compound 
called for, and it contains no information by which a person of ordinary skill in the 
art would understand that the inventors possessed the claimed invention. At 
best, it simply indicates that one should run tests on a wide spectrum of 
compounds in the hope that at least one of them will work. 
Id.
17 Id. at 233.

What the '850 patent does not do, however, is provide the necessary link between
those two steps: actually finding a compound that works. It provides precious
little guidance in the way of selecting a particular compound, or even of narrowing
the range of candidates in order to find a suitable compound without the need for
undue experimentation.

18 Rochester II, 358 F.3d 916, 921 (Fed. Cir. 2004).
19 Evans v. Eaton, 20 U.S. (7 Wheat.) 356 (1822).
20 Rochester II, 358 F.3d at 924. The court stated:

Indeed, as early as 1822 the Supreme Court recognized the existence of separate
written description and enablement requirements:

The patent act requires . . . that the party [i.e., the inventor] "shall deliver a
written description of his invention, in such full, clear, concise, and exact terms,
as to distinguish the same from all other things before known, and to enable any 
person skilled in the art or science, &c. &c, to make, compound, and use the 
same." The specification then has two objects: one is to make known the manner 
of constructing the machine (if the invention is of a machine) so as to enable 
artisans [sic] to make and use it, and thus to give the public the full benefit of the 
discovery after the expiration of the patent .... The other object of the 
specification is, to put the public in possession of what the party claims as his own 
The court distinguished the modern written description requirement from
enablement on the basis of a close similarity of the language between the Patent Acts
of 1793 and 1952. 21 The court further stated that enablement alone is not sufficient
because "an invention may be enabled even though it has not been described," 22 and
presented the following hypothetical: enablement of a closely
related invention A that is both described and enabled would similarly enable an
invention B if B were described. 23

Without a description of the compound employed in their method, the court
found that it would be impossible to practice the claimed method of treatment. 24 Like
the district court, the CAFC stated that the inventor must establish that he was in
possession of the claimed invention, even if reduction to practice is only
constructive. 25 Summary judgment for failure to meet the written description
requirement of the first paragraph of 35 U.S.C. § 112 was affirmed, 26 and the court
specifically declined to consider enablement. 27

Petitions for rehearing and for rehearing en banc of the Rochester II decision were
denied. 28 The Order, issued July 2, 2004, was accompanied by two concurring
opinions and dissenting opinions. Judge Lourie, who wrote the
opinion on appeal to the CAFC, concurred in denying rehearing, stating
that a written description requirement separate from enablement has always been
required and that a revision of its interpretation is not warranted. 29 Judge Dyk

invention, so as to ascertain if he claim anything that is in common use, or is 
already known, and to guard against prejudice or injury from the use of an 
invention which the party may otherwise innocently suppose not to be patented. 
Id. (quoting Evans v. Eaton, 20 U.S. (7 Wheat.) 356, 433-34 (1822)).
21 Rochester II, 358 F.3d at 925 ("Although the patent statutes have been extensively revised
since 1822 [when Evans v. Eaton was decided], most notably in the addition of the requirement of
claims, the language of the present statute is not very different in its articulation of the written
description requirement.").
22 Id. at 921.
23 Id.

24 Id. at 926. The court noted that:

Regardless whether a compound is claimed per se or a method is claimed that
entails the use of the compound, the inventor cannot lay claim to that subject
matter unless he can provide a description of the compound sufficient to
distinguish the infringing compounds from noninfringing compounds, or
infringing methods from noninfringing methods. As the district court observed,
"the claimed method depends upon finding a compound that selectively inhibits
PGHS-2 activity. Without such a compound, it is impossible to practice the
claimed method of treatment."

25 Id. ("Constructive reduction to practice is an established method of disclosure, but the
application must nonetheless 'describe the claimed subject matter in terms that establish that [the
applicant] was in possession of the . . . claimed invention, including all of the elements and
limitations.'") (quoting Hyatt v. Boone, 146 F.3d 1348, 1353 (Fed. Cir. 1998)).

26 Id. at 929.

27 Id. at 929-30 ("In view of our affirmance of the district court's decision on the written
description ground, we consider the enablement issue to be moot and will not discuss it further.").

28 Univ. of Rochester v. G.D. Searle & Co., 375 F.3d 1303, 1304 (Fed. Cir. 2004) [hereinafter
Rochester III].

29 Id. at 1305-07 (Lourie, J., concurring).

Contrary to the assertions of the appellant, certain amici, and some of the 
dissenters, there is and always has been a separate written description 
agreed with Judge Lourie that 35 U.S.C. § 112 contains a separate written
description requirement and further stated that, although the court has "yet to
articulate satisfactory standards that can be applied to all technologies" in enforcing
this requirement, Rochester II was not the appropriate case for en banc consideration
of this issue. 30 Judge Newman, in a dissenting opinion, supported the holding, but
argued for rehearing en banc to resolve the disagreement among the
judges. 31

Judge Linn, with whom Judges Rader and Gajarsa joined, dissented, stating, in
direct contravention of the opinions of Judges Newman, Lourie, and Dyk, that the
[...] 32
without enablement and best mode, there is no standard by which to measure
written description and, further, given that the purpose of the claims is to set forth
the metes and bounds of the invention, there is no reason to require a written
description that is distinct from enablement. 33



M 



30 Id. at 1307 (Dyk, J., concurring). 

31 Id. at 1304 (Newman, J., dissenting). Judge Newman stated:

I fully share Judge Lourie's understanding of the law. The continuing
attack on well-established and heretofore unchallenged decisions such as
Vas-Cath, Inc. v. Mahurkar . . . and earlier cases such as In re Ruschig . . . is not
only unwarranted, but is disruptive of the stability with which this court is
charged. If precedent has become obsolete or inapplicable, we should resolve the
matter as a court and again speak with one voice.
Id. (citations omitted).

32 Id. at 1305 (Linn, J., dissenting). Judge Linn's dissent states in part:

Section 112 of Title 35 of the United States Code requires a written
description of the invention, but the measure of the sufficiency of that written
description in meeting the conditions of patentability in paragraph 1 of that
statute depends solely on whether it enables any person skilled in the art to which
the invention pertains to make and use the claimed invention and sets forth the
best mode of carrying out the invention.

Id.

33 Id. at 1306. Judge Linn further stated:

Construing § 112 to contain a separate written description requirement 
beyond enablement and best mode creates confusion as to where the public and 
the courts should look to determine the scope of the patentee's right to exclude. 
Under the panel's analysis, a court looks to the written description to determine 
the parameters of the patentee's invention - under guidelines yet to be articulated 
- and then determines if the claims, as properly construed, exceed those 
parameters .... There is simply no reason to interpret section 112 to require 
applicants for patent to set forth the metes and bounds of the claimed invention in 
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requirement in the patent law. The requirement to describe one's invention is 
basic to the patent law, and every patent draftsman knows that he or she must 
describe a client's invention independently of the need to enable one skilled in the 
relevant art to make and use the invention. The specification then must also 
describe how to make and use the invention (i.e., enable it), but that is a different
task.

In sum, I concur in the decision of the court not to rehear this case en banc. 
Our precedent is clear and consistent and necessitates no revision of written 
description law. 
Judge Rader, who also dissented, and with whom Judges Gajarsa [...]
possession, as of the filing date of an application, of the subject matter later
claimed. 34

Judge Rader stated that the hypothetical 35 wherein "a patent can enable an
invention that is not described," which was presented by the panel, "rarely, if ever,
happens. No actual case presents the hypothetical." 36 He concluded that "ample
remedies" exist to address the "Rochester hypothetical" in the absence of a
separate written description requirement. 37
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III. .Historical Development of the Written Description 
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The Patent Act of 1790 33 included in Section 2 a requirement that: 




[T]he grantee or grantees of each patent shall, at the time of granting the
same, deliver to the Secretary of State a specification in writing, containing
a description, accompanied with drafts or models, and explanations and 
models ... of the thing or things, by him or them invented or discovered . . 



The Act further required:

[The] specification shall be so particular, and said models so exact, as not 
only to distinguish the invention or discovery from other things before 
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two separate places in the application, 
That is the exclusive function of the
claims. (citations omitted)
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*• Id. at 1311 (Rader, J., dissenting) . Judge Rader noted in his dissent- 

Beginning in 1967, this court and its predecessor applied the written description 
language to achieve this vital purpose of the Patent Act - tying disclosure to the 
time of invention. In the words of Judge Rich, the first judge to use the
description requirement to police priority: "The function of the description
requirement is to ensure that the inventor had possession, as of the filing date of
the application relied on, of the specific subject matter later claimed by him."
Id. (emphasis added) (citations omitted).

35 Rochester II, 358 F.3d 916, 92 (Fed. Cir. 2004).

36 Rochester III, 375 F.3d at 1312 (Rader, J., dissenting).

37 Id. at 1313 (Rader, J., dissenting) ("In sum, our patent law (and the world's patent law) has
worked well for 200 years because the law already possesses ample remedies for the Rochester
hypothetical, which, as a practical matter, never occurs.").

38 Patent Act of 1790, ch. 7, 1 Stat. 109-12 (April 10, 1790) (repealed 1793) (current version at
35 U.S.C. § 112 (2000)).

39 See id. § 2.
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known and used, but also to enable a workman or other person skilled in 
the art or manufacture, whereof it is a branch, or wherewith it may be 
nearest connected, to make, construct, or use the same, to the end that the 
public may have the full benefit thereof, after the expiration of the patent 
term 40 
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According to the literal language of the Patent Act of 1790, therefore, the 
specification was required to "distinguish the invention or discovery from other 
things before known and used." Separately from the requirement of distinguishing 
the invention from other things previously known and used, the specification was 
required to "enable a workman or other person skilled in the art or manufacture . . . 
to make, construct, or use the same." The phrase, "to the end that the public may 
have the full benefit thereof, after expiration of the patent term," follows from the 
second requirement of the specification, enablement, because distinguishing the
invention from other things before known would not be a benefit to the public that
would continue after expiration of the patent. 

The Patent Act of 1793 41 retained dual requirements of a description that 
distinguished the invention from things previously known and of enablement by any 
person skilled in the art to make and use the invention. Section 3 of the Act stated:
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[E]very inventor . . . shall deliver a written description of his invention, and 
of the manner of using, or process of compounding the same, in such full, 
clear and exact terms, as to distinguish the same from all other things 
before known, and to enable any person skilled in the art or science, of 
which it is a branch, or with which it is most nearly connected, to make, 
compound, and use the same. 42 




Therefore, the Patent Act of 1793 still required that the written description 
distinguish the invention from things previously known and, separately, enable one
skilled in the art to make and use the invention. 

In Section 6 of the Patent Act of 1836, 43 the requirement that the specification 
distinguish the invention from "other things before known" was eliminated, leaving 
the written description with the sole requirement "as to enable": 


[H]e shall deliver a written description of his invention or discovery, and of 
the manner and process of making, constructing, using, and compounding 
the same, in such full, clear, and exact terms, avoiding unnecessary 
prolixity, as to enable any person skilled in the art or science to which it 
appertains, or with which it is most nearly connected, to make, construct, 
compound and use the same . . . . 44
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"Id. 

41 Patent Act of 1793, ch. 11, 1 Stat. 318-23 (February 21, 1793) (repealed 1836) (current
version at 35 U.S.C. § 112 (2000)).
42 Id. § 3.

43 Patent Act of 1836, ch. 357, 5 Stat. 117 (July 4, 1836) (repealed 1870) (current version at 35
U.S.C. § 112 (2000)).
44 Id. § 6.
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same section of the Act, that the inventor "shall particularly specify and point out the ' '■ — ' 

part, improvement, or combination, which he claims as his own invention or 
discovery." 45 

Section 26 of the Patent Act of 1870 46 incorporated language relating to the 
requirements of a written description nearly identical to that of the Patent Act of 
1836: 

That before any inventor or discoverer shall receive a patent for his

invention or discovery, he shall . . . file in the patent office a written 
description of the same, and of the manner and process of making, 
constructing, compounding, and using it, in such full, clear, concise, and 
exact terms as to enable any person skilled in the art or science to which it 
appertains, or with which it is most nearly connected, to make, construct, 
compound, and use the same . . . . 47 

The Patent Act of 1870, also like that of 1836, required the inventor to 
particularly point out what he claimed as his invention or discovery, but included the 
additional criterion that he "distinctly claim" his invention: ". . . and he shall
particularly point out and distinctly claim the part, improvement, or combination
which he claims as his invention or discovery . . . ." 48

That the inventor was required by the literal language of the statute to 
"distinctly claim" the invention, apart from providing a specification that included an 
enabling written description, is evidenced by the last phrase of Section 26, which 
separately identifies the specification and claim: "... and said specification and 
claim shall be signed by the inventor and attested by two witnesses." 49 

Under the Patent Act of 1952, 50 United States Code Title 35, Section 112 ("§
112"), the term "specification" embraces both a "written description of the invention,"
and "claims," in the first and second paragraphs, respectively, and includes much of
the language of Section 26 of the Patent Act of 1870. The first and second
paragraphs of 35 U.S.C. § 112 are as follows:
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The specification shall contain a written description of the invention, rzr — \ 

and of the manner and process of making and using it, in such full, clear,

concise, and exact terms as to enable any person skilled in the art to which 
it pertains, or with which it is most nearly connected, to make and use the 
same, and shall set forth the best mode contemplated by the inventor of 
carrying out his invention. 



46 Patent Act of 1870, ch. 230, 16 Stat. 198-217 (July 8, 1870) (repealed 1952) (current version
at 35 U.S.C. § 112 (2000)).
47 Id. § 26.

50 Patent Act of 1952, ch. 950, § 1, 66 Stat. 798 (July 19, 1952) (current version at 35 U.S.C. §
112 (2000)).
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Evans v. Eaton, 52 derided- m-l-822;"was"rited- by 
recognition by the Supreme Court of a written description requirement separate from 
enablement since the Patent Act of 1793. 53 Judge Rader, in his dissent from the
decision to deny rehearing, stated that the Patent Act of 1793 required that the
written description both distinguish the invention from "other things before known"
and "enable any person skilled in the art." 54 He suggested that the court overlooked
the significance of the fact that in 1822 there was no statutory requirement to
separately claim the invention 55 and that omitting the language "to distinguish the
same from all other things before known" in later acts, in conjunction with a
statutory requirement to include claims, could only mean that "enablement became
the sole 35 U.S.C. § 112, ¶ 1 standard for adequate disclosure of an invention." 56

The literal language of the portion of Section 3 of the Patent Act of 1793 
requiring inventors to "deliver a written description of his invention . . . as to
distinguish the same from all other things before known ..." was interpreted in 
Evans in view of Section 2, which was directed to the rights of inventors of "original 

51 35 U.S.C. § 112 (2000).

52 Evans v. Eaton, 20 U.S. (7 Wheat.) 356 (1822).
53 See supra Section II.

54 Rochester III, 375 F.3d 1303, 1309 (Rader, J., dissenting). In his dissent, Judge Rader
reasoned:

In 1793, the Patent Act, 1 Stat. 318, required an inventor to describe the 
scope of the invention in the body of the specification; the Act did not require any 
claims. Instead the Act required the inventor to provide "a written description of 
his invention, and of the manner of using, or process of compounding the same, in 
such full, clear, and exact terms, as to distinguish the same from all other things
before known, and to enable any person skilled in the art or science ... to make, 
compound and use the same." 
Id. (citations omitted). 

55 Id. at 1310 ("For obvious reasons, Rochester undertakes no further explanation of the 
Supreme Court's language. In simple terms, the Supreme Court could not have meant that the 
written description portion must provide adequate support for the claims as this court's law 
presently requires. Patents did not even contain claims in 1822."). 
56 Id. Judge Rader further explained:

The Supreme Court clearly linked its "other object" of the specification disclosure
to the portion of the statute requiring the inventor "to distinguish the same from
all things before known." Significantly, that language no longer appears in 35
U.S.C. § 112. Later, in 1870, the Patent Act first articulated the requirement that
applicants define their exclusive right in a distinctly drafted claim. Only one
logical conclusion flows from this history. When the Patent Act assigned the
notice function to claims rather than the written description, enablement became
the sole 35 U.S.C. § 112, ¶ 1 standard for adequate disclosure of an invention.
Id. (citations omitted).
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discoveries" relative to those who patent 
"improvements." 57 As stated by the lower court in Evans:
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If the alleged inventor of a machine, which differs from another previously 
patented, merely in form and proportion, but not in principle, is not entitled 
to a patent for an improvement, which he cannot be by the 2d section of the
law, he certainly cannot, in a like case, claim a patent for the machine
itself. 58
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The lower court and the opinion by the Supreme Court addressed whether, in | \ 
the case of an improvement on a flour mill called a "Hopperboy," the inventor was
entitled to the improved invention as a whole or only the part of the invention that
represents the improvement. Further, both the lower court and the Supreme Court
discussed at length whether an improvement must be identified by the inventor as
such. The lower court stated the inventor must be aware of the original, 59
explaining that to hold otherwise would grant overly broad protection. 60 However,
the same obligation to identify an improvement is not necessary, according to the
lower court, where the invention is an "original machine." 61

Counsel for plaintiff, in defense of validity of the patent, argued strenuously that 
the subject invention of the patent need not be compared to "things before known or 




57 Patent Act of 1793, ch. 11, § 2, 1 Stat. 318-23 (February 21, 1793) (repealed 1836) (current
version at 35 U.S.C. § 112 (2000)):

Provided always, and be it further enacted, That any person, who shall have 
discovered an improvement in the principle of any machine, or in the process of 
any composition of matter, which shall have been patented, and shall have 
obtained a patent for such improvement, he shall not be at liberty to make, use
or vend the original discovery, nor shall the first inventor be at liberty to use the
improvement: And it is hereby enacted and declared, that simply changing the
form or the proportions of any machine, or composition of matter, in any degree, 
shall not be deemed a discovery. 

Id.

58 Evans v. Eaton, 20 U.S. (7 Wheat.) 356, 361-62 (1822).

59 Id. at 367 ("The answer to this is, that an improvement necessarily implies an original, and 
unless the patentee is acquainted with the original which he supposes he has improved, he must 
talk idly, when he calls his invention an improvement."). 

60 Id. at 362 ("Because if that superiority amounts to an improvement, he is entitled to a patent 
only for an improvement, and not for the whole machine. In the latter case the patent would be too 
broad, and therefore void when the patent is single"). 

61 Id. at 367-68. The court stated:

If he knows nothing of an original, then his invention is an original, or 
nothing; and the subsequent appearance of an original to defeat his patent is one
of the risks, which every patentee is exposed to under our law. As to the supposed
distinction between an improvement on a machine patented, and one not so, there
is nothing in it. In both cases the improvement must be described, but with this
difference: That in the former case it may be sufficient to refer to the patent and
specification, for a description of the original machine, and then to state, in what
the improvements on such original consists; whereas, in the latter case, it would
be necessary to describe the original machine, and also the improvement. The 
reason for this distinction is too obvious to need explanation. 
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used," but rather, that the specification need "merely to distinguish new from old." 62 ^ 
Plaintiff's counsel argued that Section 2 spoke
to rights associated with improvements on previously patented inventions, while
Section 3 spoke to substantive requirements of the written description of any patent,
regardless of whether it represented an "improvement" under Section 2. 63

Defendant's counsel argued the opposite, that if the invention was directed to an
improvement, the specification must identify the improvement. 64 Failure to do so
meant that the patent was void either because the invention was not new or
because the patent was overly broad by not distinguishing the point of novelty. 65

Justice Story, writing for the Supreme Court, referenced Section 2 of the Patent
Act of 1793, stating that an inventor is not entitled to use an "original discovery"
where the invention lies in an improvement of what was previously known, "nor is 
the first inventor at liberty to use the improvement." 66 Further, the Court stated 






62 Id. at 37__. Plaintiff's counsel asserted:

It was a second error of the Court, to take it for granted, that the improved 
Hopperboy was not so described in the specification, as to distinguish it from all 
things before known or used, and to enable a person skilled in the art to make it. 
It is so described .... No one skilled in the art could misapprehend this 
description or be misled by it ... . The law does not require of patentees to 
describe new and old, but merely to distinguish new from old. Otherwise a patent
would be more complex and voluminous than a Welsh pedigree.



63 Id. at 369-70. The plaintiff's counsel further noted:

Indeed, it may well be doubted whether any discrimination is necessary, 
where, as in this case, there is but one patent in existence. The second section of 
the law speaks of the case of a prior patented machine. The Court would have the 
third section to be substantive, without association with the second and sixth 
[regarding defenses to infringement such as lack of novelty]. But how can a 
patentee describe what he never saw? 
Id. at 376-77.

64 Id. at 358. Counsel for the defendant stated:

The patentee ought, in his specification, to inform the person who consults it, 
what is new and what is old. He should say, my improvement consists in this, 
describing it by words, if he can, or if not, by reference to figures. . . . All that is
contended, and that is fully supported by authority, and by the reason of the case,
is, that the specification must, in some way or other, distinguish the new from the
old, the improvement from what was known before, so as to show what the
patented invention is, or else the patent is broader than the invention, and void. 
Id. at 389-90. 

65 Id. at 358. The court further noted:

[I]t is confidently submitted, that the patent of Oliver Evans must be considered
as a patent either for the machine or for the improvement. 

That if it be for the machine, it is void, because it is fully proved that he was not 
the original inventor, but the machine was known and used before. 
That if it be for an improvement, it is void, because it is broader than his 
invention, and does not specify in what his invention consists, so as to distinguish 
it from what was known and used before. 
Id. at 394.

66 Id. at 429. The majority set out that:

The Patent Act of the 21st of February, 1793, ch. 11 upon which the validity of our 
patents generally depends, authorizes a patent to the inventor, for his invention 
or improvement in any new and useful art, machine, manufacture or composition 
of matter not known or used before the application. It also gives to any inventor 
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that, according to the statute, a mere change in form or proportion will not be 
considered to be a discovery. 67 The Court also cited Section 3 of the Patent Act of 
1793, and stated that the "specification, then, has two objects":
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The third section of the patent act requires, as has already been stated; 

that the party 'shall deliver a written description of his invention, in such
full, clear, and exact terms, as to distinguish the same from all other things
before know [sic], and to enable any person skilled in the art or science, &c.
&c. to make, compound, and use the same.' The specification, then, has two
objects: one is to make known the manner of constructing the machine (if
the invention is of a machine) so as to enable artizans [sic] to make and use
it, and thus to give the public the full benefit of the discovery after the 
expiration of the patent .... The other object of the specification is, to put 
the public in possession of what the party claims as his own invention, so as 
to ascertain if he claims anything that is in common use, or is already 
known, and to guard against prejudice or injury from the use of an 
invention which the party may otherwise innocently suppose not to be 
patented. 68 
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Sections 2 and 3 of the Act were summarized by stating that the inventor's 
patent is valid for a whole machine "only by establishing that it is substantially new 
in its structure and mode of operation." 69

If, then, the plaintiff is not entitled to a patent to the invention as a whole, the 
question, according to the Court, became whether the patent was valid as an 
improvement. 70 Relying again on Section 3 of the Patent Act, which required that
the inventor "shall deliver a written description of his invention . . . , in such full,
clear and exact terms, as to distinguish the same from all other things before known,
and to enable any person skilled in the art or science ... to make, compound, and use
the same," the Court held that the specification must identify the improvements and
limit the patent to the improvements. 71 The Court affirmed the judgment of the

of an improvement in the principle of any machine, or in the process of any 
composition of matter which has been patented, an exclusive right to a patent for 
his improvement; but he is not to be at liberty to use the original discovery, nor
is the first inventor at liberty to use the improvement.

Id.

67 Id. ("It also declares that simply changing the form or the proportion of any machine or 
composition of matter, in any degree, shall not be deemed a discovery.").
68 Id. at 433-34.

69 Id. at 430. The majority noted:

From this enumeration of the provisions of the act, it is clear that the party 
cannot entitle himself to a patent for more than his own invention; and if his 
patent includes things before known, or before in use, as his invention, he is not 
entitled to recover, for his patent is broader than his invention. If, therefore, the
patent be for the whole of a machine, the party can maintain a title to it only by
establishing that it is substantially new in its structure and mode of operation.



70 Id. at 432-33.

71 Id. at 434-35. The court asserted that:

The specification must describe the invention 'in such full, clear, and distinct
terms, as to distinguish the same from all other things before known.' How can
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circuit court in vali dating the patent because the patent did not specifically identify 

the improvements over a previously known Hopperboy as such. 72

Justice Livingston, in a dissenting opinion, disputed the interpretation of the 
statute by the majority. 73 Specifically, he stated that the law does not require
identifying in what particulars the patented improvement lay. 74 Justice Livingston
stated that, if the improved machine is distinguishable upon comparison with known
machines, "the words and the objects of the law are satisfied," 75 and that those
objects are limited to warning the public against encroachment and to
enablement of the public to practice the invention upon expiration of the patent:

The law appears to have nothing else in view, in requiring a specification, 
then [sic] the instruction of the public; that is, to guard them against a
violation of the patented improvement, and to enable them, when the
letters-patent expire, from the specification filed, to make a machine similar
to the one which had been patented. The only inquiry, therefore, ought to
be, whether this obvious intention of the legislature has been answered by
the particular specification which may be the subject of litigation; and if
enough appears, either to prevent a person from encroaching on the right of 
the patentee, or to enable a skillful person to make a machine which shall 
not only resemble the one patented, but produce the like effect; more ought 
not to be required. 76 



that be a sufficient specification of an improvement in a machine, which does not 
distinguish what the improvement is, nor state in what it consists, nor how far 
the invention extends? Which describes the machine fully and accurately, as a 
whole, mixing up the new and old, but does not in the slightest degree explain 
what is the nature or limit of the improvement which the party claims as his own?
It seems to us perfectly clear that such a specification is indispensable. We do not 
say that the party is bound to describe the old machine; but we are of opinion that 
he ought to describe what his own improvement is, and to limit his patent to such 
improvement. 



72 Id. at 435. The court finally stated:

We do not consider that the opinion of the Circuit Court differs, in any 
material respect, from this exposition of the patent act on this point; and if the 
plaintiff's patent is to be considered as a patent for an improvement upon an
existing Hopperboy, it is defective in not specifying that improvement, and 
therefore the plaintiff ought not to recover. 

73 Id. at 441 (Livingston, J., dissenting).

74 Id. In his dissent, Justice Livingston stated:

We have seen already that the law prescribes no precise form of 
specification, which would have been impracticable, and imposes no obligation to 
describe, in any particular mode, the machine in question. Not a word is said as 
to showing in what particulars the improvement patented differs from all other 
machines for the same purpose then in use. 

Id.

75 Id. ("If, on the whole description taken together, the machine of the plaintiff can be
distinguished from other machines when compared with his, the words and the objects of the law 
are satisfied."). 

76 Id. at 441-42.
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The issues presented by Evans, then, were whether the plaintiffs patent was 
directed to a machine that was original or an improvement of a previously known
machine and, if an improvement, whether the specification required delineation of
the features that constitute the improvement from among those in the specification.
The majority held that the plaintiff's invention was an improvement on that which
was previously known and that the inventor was obligated to identify in the
description where the improvement lay. 77 The dissent, on the other hand, stated that
so long as the invention, as described, was distinct from previously known devices,
the specification was adequate because the law did not require a list of features that
specifically constitute the improvement. 78

Although the majority and dissenting opinions in Evans differed as to whether 
the specification, where the patented invention represents an improvement, must 
specifically state where the improvement lay, or whether it is sufficient merely to 
describe an invention that is, in fact, different from that which has gone before, both 
opinions agreed on the policy objectives of the statute setting the requirements of the 
specification. In particular, both opinions agreed that the objectives of the statute 
included enabling the public to benefit from the inventors' discovery after expiration 
of the patent, and putting the public on notice as to the scope of the invention as a 
warning against encroachment. 

The issue in Evans was intimately tied to a statutory requirement that the 
specification distinguish the invention from "all other things before known." As 
discussed above, the Patent Act of 1836 did not include this language, but instead
required that the inventor "particularly specify and point out the part, improvement,
or combination, which he claims as his own invention or discovery." 80 The explicit
requirement in the Patent Act of 1870 that the inventor "shall particularly point out 
and distinctly claim the part, improvement, or combination which he claims is his 
invention or discovery," 81 met the "other object" of the specification identified in 
Evans, of putting "the public in possession of what the party claims as his own 
invention, so as to ascertain if he claim[s] anything that is in common use, or is
already known," and guarding "against prejudice or injury from the use of an
invention which the party may otherwise innocently suppose not to be patented." 82
Further, the requirement of the second paragraph of 35 U.S.C. § 112 that "the
specification shall conclude with one or more claims particularly pointing out and
distinctly claiming the subject matter which the applicant regards as his invention,"
is consistent with denial of protection to an applicant whose specification does not 
call out which among the listed features constitutes an improvement over things 
before known, as was the case in Evans. 

Contrary to the CAFC's analysis in Rochester II, support for a "separate" written
description requirement under the Patent Act of 1952 cannot be found by drawing a 
parallel with the Patent Act of 1793 because the "other object" of the written 
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description requirement of the specification was met by the introduction of a 
statutory requirement for claims. 

2. "Merely Colorable Differences": Relevant Developments in Patentability and
Infringement Prior to the Patent Act of 1952

At least as early as 1814, a parallel was drawn between patentability and 
infringement in view of the "principle" or "mode of operation" of an invention or 
machine. As stated by Circuit Justice Story in Odiorne v. Winkley: 83

The first question for consideration is, whether the machines used by the
defendant are substantially, in their principles and mode of operation, like
the plaintiff's machines. If so, it was an infringement of the plaintiff's
patent to use them, unless some of the other matters offered in the defense 
are proved. Mere colorable alterations of a machine are not sufficient to 
protect the defendant. The original inventor of a machine is exclusively 
entitled to a patent for it. If another person invents an improvement on 
such machine, he can entitle himself to a patent for such improvement only, 
and does not thereby acquire a right to patent and use the original machine; 
and if he does procure a patent for the whole of such machine with the 
improvement, and not for the improvement only, his patent is too broad and 
therefore void. It is often a point of intrinsic difficulty to decide whether 
one machine operates upon the same principles as another. . . . The
material question, therefore, is not whether the same elements of motion, or 
the same component parts are used, but whether the given effect is 
produced substantially by the same mode of operation, and the same 
combination of powers, in both machines. Mere colorable differences, or 
slight improvements, cannot shake the right of the original inventor. 84

Therefore, the court in Odiorne stated that when the "given effect is produced 
substantially by the same mode of operation" as that of a patented machine, the 
differences between the machines are "mere colorable differences, or slight 
improvements" that do not protect against a finding of infringement. 85 By stating
that such differences "cannot shake the right of the original inventor," the court 
further suggested that identification of the principle of an improvement was a test of 
both infringement of the patent and of the patentability of the improvement. 

In Gray v. James, 86 decided in 1817, the Circuit Court for the District of
Pennsylvania held in an infringement action that a claimed invention by Perkins was 
not shown to operate by the same mode as a machine described in an earlier patent 
by Chandler. In particular, the Chandler patent was not a reference against Perkins 
because the description in the Chandler patent was insufficient to determine whether 
they operated by the same mode, or principle:
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83 Odiorne v. Winklev. 18 F._Cas. 581 (CUD. Mass, 1814). 
But the important difference is in the mode of operating. Perkins' machine
makes the nail by one and the same pressure of the lever, Chandler's, so far
as the court can perceive, effects nothing more by the pressure of the lever,
than the cutting of the nail rod; but by what power the side or horizontal
levers which form the head, are moved, does not appear, otherwise than as
it is stated in the specification, to be by the action of what is called
secondary levers, or the axes of a wheel during its revolutions. But, by
what power are these secondary levers or wheel[s] worked? . . . In short,
the court finds it impossible to discover in what manner the complicated 
parts of this machine are worked, beyond the pressure of the lever which 
cuts the nail .... The one operates by means of a single power; the other by 
the aid of more than one power. Or, if this be not so, it behooves the 
defendants clearly to show the contrary, before the court can listen to a 
motion to set aside the verdict, on the ground that the two machines are 
substantially alike in principle. 87 
Therefore, the general description in an earlier patent to Chandler of making a
nail by an apparatus that, like Perkins' nail cutter, included a lever to cut the nail rod,
did not foreclose Perkins' right to a patent. In other words, until the mode of
operation of the device disclosed by Chandler could be shown, the court
assumed that the principle of operation was different, thereby obviating the
challenge to Perkins' right to a patent to the nail-making machine described.

In Davis v. Palmer, decided by the Circuit Court for the District of Virginia in 1827,
after Evans, Circuit Justice Marshall held, in an action for infringement of a
patent directed to a mould-board of a plow, that although the
improvement was clearly stated in the specification as that of working the
mould-board "to circular or spheric lines," as opposed to straight lines, 89 the patentee
was limited to the particular embodiment shown. 90 Nevertheless, the court
responded to requests by the defendant to modify jury instructions relating to
infringement, to the standard by which patentability is determined, and to the
sufficiency of the specification of the plaintiff's patent. With respect to
infringement, the court stated that:

The patent, undoubtedly, covers only the improvement precisely described.
But if the imitation be so nearly exact as to satisfy the jury that the 
87 Id. at 1020.

89 Davis v. Palmer, 7 F. Cas. 154, 157 (C.C.D. Va. 1827) ("[I]nstead of working the moulding
part, or face of the mould-board to straight lines, my improvement is to work it to circular or spheric
lines." (quoting the plaintiff)).

90 Id.

We are then decidedly of opinion, that a mould-board conforming to the particular
description contained in the specification, is the invention which the plaintiff 
claims, and that instead of being a mere illustration of the principle stated in the 
introductory part of the specification, it is itself the essential improvement, of 
which only a general idea was given in the introductory part. 

Id.
imitator attempted to copy the model, and to make some almost
imperceptible variation, for the purpose of evading the right of the patentee,
this may be considered as a fraud on the law, and such slight variation be
disregarded. 91

Similarly, with respect to patentability, the court stated that not every change in 
form or proportion is fatal, despite the language of Section 2 of the Patent Act of 
1793: 

In construing this provision [of Section 2 of the Patent Act of 1793], the
word "simply," has, we think, great influence. It is not every change of form 
and proportion which ios [sic] declared to be no discovery, but that which is 
simply a change of form or proportion, and nothing more. If, by changing 
the form and proportion, a new effect is produced, there is not simply a 
change of form and proportion, but a change of principle also. 92 

With respect to the written description, the defendant requested that the jury be
instructed that it "must be satisfied that the former mould-board is described with
sufficient certainty, to distinguish between it and the improvement claimed." 93
Contrary to the defendant's request, the court found that a description that
distinguished between a former mould-board and the patentee's improvement was
unnecessary. Rather, only a general reference to mould-boards, or to one commonly
known, was required, along with a description of the patentee's improvement "as will
enable a workman to distinguish what is new." 95

Therefore, even under the Patent Act of 1793, 96 which predated any requirement
for claims, and subsequent to Evans, the written description requirement was
interpreted to require only that the patentee provide a description of his improvement
"enab[ling] a workman to distinguish what is new." 97 Moreover, all three
modifications to the jury instructions reflect an underlying policy of protecting an
inventor's right against imitators in exchange for providing to the public notice and
an enabling description of his improvement. The defendant's requested jury
instruction requiring more of the written description, i.e., "that the former



91 Id. at 159.

92 Id.
93 Id.
94 Id.

95 Id. The Court stated that:

We do not think a particular description of the former mould-board is necessary.
A general reference to it, either in general terms which are not untrue, or by
reference to a particular mould-board, commonly known, accompanied by such a
description of the improvement as will enable a workman to distinguish what
is new, will be sufficient.

96 See supra Section III.A.
97 Davis, 7 F. Cas. at 159.






[1:160-20623" 



John Marshall Review of Intellectual Property Law






mould-board is described with sufficient certainty, to distinguish between it and the
improvement claimed," was specifically denied by the court.

Infringement and patentability were further compared in the Supreme Court case of
Hotchkiss v. Greenwood, decided after enactment of the Patent Act of 1836. The majority
held that a patent for a "door knob" made of potter's clay was void for lack of ingenuity
or skill greater than that of an ordinary mechanic. 102 In the dissenting opinion,
Justice Woodbury stated that,
in his view, "the true test of its being patentable was, if the invention was new, and
better and cheaper than what preceded it." 103 Further, Justice Woodbury stated that
such a test is the same as that employed to determine whether patented subject
matter is sufficiently distinct so as to avoid infringement of an earlier patent, thereby
applying the same standard to both infringement and patentability: "Whenever the
kind of test adopted below is used otherwise than to see if there has been an 
infringement or not, it is to ascertain whether the invention is original or not, that is, 
whether it is a trifling change and merely colorable or not." 104 

Among the references relied upon for this proposition by Justice Woodbury 105
was Lowell v. Lewis, 106 which, like Odiorne, based infringement on whether the
accused device was "substantially" the same invention claimed by the patent
holder. 108 In Lowell, decided by the Circuit Court of the United States for the First
Circuit in 1817, infringement was determined by whether the pump described in a
patent to the plaintiff, Mr. Perkins, was substantially the same as that of the
defendant, Mr. Baker. 109 Whether the pumps of Mr. Perkins and Mr. Baker were, in
fact, the "same invention" was determined by whether the differences were merely in 
"form or proportion," because such differences could not be the basis for a "new 
invention." 110 The reliance on Lowell by the dissenting opinion in Hotchkiss means 
102 Hotchkiss, 52 U.S. at 267. The majority stated:

[F]or unless more ingenuity and skill in applying the old method of fastening the
shank and the knob were required in the application of it to the clay or porcelain
knob than were possessed by an ordinary mechanic acquainted with the business,
there was an absence of that degree of skill and ingenuity which constitute
essential elements of every invention. In other words, the improvement is the
work of the skillful mechanic, not that of the inventor.
103 Id. at 268 (Woodbury, J., dissenting).
104 Id. at 269.

105 Id.

106 Lowell v. Lewis, 15 F. Cas. 1018 (C.C.D. Mass. 1817).
108 Id. at 1019-20.
109 Id. at 1021.
110 Id.

Another (and under the circumstances of this case, probably the most 
material) inquiry is, whether the defendant has violated the patent right of the 
plaintiff and that depends upon the fact, whether the pumps of Mr. Perkins and 
of Mr. Baker are substantially the same invention. I say substantially the same 
invention, because a mere change of the form or proportions of any machine 
cannot per se be deemed a new invention. If they are the same invention, then 
Mr. Perkins, being clearly the first inventor, is entitled exclusively to the patent 
rights, although Mr. Baker may have been also an original inventor, for the law 
gives the right, as among inventors, to him, who is first in time.
that, as late as 1851, after the patent statute had been changed to delete the explicit
requirement that "[...] before known," at least one Supreme Court justice was hinging
patentability and infringement on whether the invention, as described by the patentee,
differed from the prior art (in the case of patentability) or from an accused device (in the
case of infringement) in ways that were "merely colorable."

In Winans v. Adam, 111 the Supreme Court focused on whether a patentee's claim
to an invention should be limited to the literal form of the claim, or whether it should
embrace embodiments that were substantially the same, in principle and mode of
operation. 112 The plaintiff had requested the circuit court to instruct the jury with
respect to infringement, as follows:

[T]hat what they had to look at was not simply whether, in form and
circumstances, which may be more or less immaterial, that which had been
done by the defendant varied from the specification of the plaintiff's patent,
but to see whether, in substance and effect, the defendants, having the
same object in view as that set forth in the plaintiff's specification, had,
since the date thereof, constructed cars which, substantially, on the same 
principle and on the same mode of operation, accomplished the same 
result. 113 

This language, according to the plaintiff, "was taken verbatim, nearly" from an
instruction to the jury in an earlier case, Walton v. Potter, which was another
instance where patentability was compared with infringement. 116 The plaintiff then
cited several cases where patentability and infringement were determined by
ascertaining whether a difference from an earlier device (in the case of patentability),
or a difference from a device in a patent (in the case of infringement), were
differences in principle and mode of operation, or were merely either colorable
differences or a substitution of mechanical equivalents. For example, with respect
to patentability, the plaintiff referenced Huddart v. Grimshaw. 117 Regarding



Id 



111 Winans v. Adam, 56 U.S. (15 How.) 330 (1853).
112 Id. at 338-39.

113 Id. at 334 (quoting the request by the plaintiff) (emphasis added).

115 Id. Specifically, as recited by the plaintiff in Winans with respect to Walton:

It is in this case [Walton] that C.J. Tindal says, 'That if a man has by dint of his 
own genius and discovery, after a patent has been obtained, been able to give the 
public, without reference to the former one, or borrowing from the former one, a 
new and superior mode of arriving at the same end, there can be no objection to 
his taking out a patent for that purpose. But he has no right whatever to take, if I 
may so say, a leaf out of his neighbor's book, &c.'
Id. (citations omitted).

116 Id. at 337.

117 Huddart v. Grimshaw, 1 Web. Pat. Cases 85. "If the tube and the plate were the same,
substantially, the difference being colorable only, then the patent was void, otherwise it was good;
and the question was left to the jury, who found for the plaintiff." Winans v. Adam, 56 U.S. 330, 335
(1853). 
multiply without end, the forms in which that principle can be made to operate." 121
This reasoning was applied by the plaintiff to his claimed invention in Winans:



As in the case under discussion, the moment a practical, scientific man
is furnished with the idea of giving to the car a shape which will, by
dispensing with the framing [...]
its load, than it has ever been made before, he can multiply
without end the forms in which this principle can be made to operate. 122
The plaintiff concluded that the test of infringement must be whether the
accused railroad cars of the defendant were "substantially, in principle and mode of
operation, within the plaintiff's patent." 123

The defendant, on the other hand, argued that an invention, as claimed, is
confined to the form described, the form and the principle being one
and the same. 124 In other words, the defendant argued that infringement should be
limited to the literal scope of the patent claim.

The majority opinion delivered by Justice Curtis held that the principle and 
mode of operation established both patentability and the scope of a claim. 126 The 
118 Russell v. Cowley, 1 Web. Pat. Cases 457. "This was the case of a patent for welding iron
tubes, by drawing them, at a welding heat, through a conical hole. The infringement was the
passing them between rollers; and the question of colorable or substantial difference was referred to
the jury." Winans, 56 U.S. at 335.

119 Morgan v. Seaward, 1 Web. Pat. Cases 167. "Therefore, the two machines were alike in
principle; one man was the first inventor of the principle, and the other has adopted it; and though
he may have carried it into effect by substituting one mechanical equivalent for another, still you
(the jury) are to look to the substance, and not the mere form, and if it is in substance an
infringement, you ought to find so." Winans, 56 U.S. at 335 (quoting Morgan).

121 Winans, 56 U.S. at 336.

122 Id.

123 Id. at 332. Counsel for the plaintiff went on to state:

Still the question must always be, whether, whatever the shape he adopts, he is
not availing himself of the principle first suggested by the patentee; a question 
which, in a court of law, is at all times a question not for the court, but the jury; 
after the former shall have given to the specification that construction which is to 
govern the latter in determining whether the infringement complained of falls, 
substantially, in principle and mode of operation, within the plaintiff's patent.
Id. at 336.

124 Id. at 337-38. Defendant's counsel stated:

But the claim is confined to a single form, and only through and by that form to 
the principles which it embodies; and if, out of the many forms embodying more or 
less perfectly the same mode of operation, the plaintiff in error has made his
choice of the best, he is confined to that choice and the rejection which it involves 
of all other forms less felicitous .... Where the invention consists of a principle 
embodied in a single form, the form is the principle and the principle the form, 
and there can be no violation of the principle without the use of the form. 
Id. (citations omitted). 

125 Id. at 341. The majority wrote:
majority cited Davis v. Palmer as an instance where the patent is limited to the
particular form described and claimed, but explained such
cases, not on the basis that the inventor was limited in the scope of protection to
what literally was described and claimed, but rather that the form disclosed by the
patentee was the only form "capable of embodying the invention." 128 The court held
that the jury must decide infringement according to whether the defendant's cars
were "the same in kind, and effected by the employment of his [the plaintiff's] mode
of operation in substance." 129

The dissenting opinion also relied on Davis v. Palmer, but argued that the
specification and claim determine the limit of an invention. 130 This position was
grounded in the concern that failure to confine a patentee's right to the literal bounds
Under our law a patent cannot be granted merely for a change of form. . . .

[T]o change the form of an existing machine, and by means of such change to
introduce and employ other mechanical principles or natural powers, or, as it is
termed, a new mode of operation, and thus attain a new and useful result, is the
subject of a patent. Such is the basis on which the plaintiff's patent rests.
Its substance is a new mode of operation, by means of which a new result is
obtained. It is this new mode of operation which gives it the character of an
invention, and entitles the inventor to a patent; and this new mode of operation is,
in view of the patent law, the thing entitled to protection.

Now, while it is undoubtedly true, that the patentee may so restrict his claim as
to cover less than what he invented, or may limit it to one particular form of
machine, excluding all other forms, though they also embody his invention, yet
such an interpretation should not be put upon his claim, if it can fairly be
construed otherwise, and this for two reasons:
1. Because the reasonable presumption is, that, having a just right to cover and
protect his whole invention, he intended to do so.
2. Because specifications are to be construed liberally, in accordance with the
design of the Constitution and the patent laws of the United States, to promote
the progress of useful arts, and allow inventors to retain to their own use, not
anything which is matter of common right, but what they themselves have
created.

Davis v. Palmer, 7 F. Cas. 154, 157 (C.C.D. Va. 1827).

128 Winans, 56 U.S. at 343. In the majority opinion, Justice Curtis wrote:

Undoubtedly, there may be cases in which the letters-patent do include only
the particular form described and claimed. Davis v. Palmer seems to have been
one of those cases. But they are in entire accordance with what is above stated.
The reason why such a patent covers only one geometrical form, is not that the
patentee has described and claimed that form only; it is because that form only is
capable of embodying his invention; and, consequently, if the form is not copied,
the invention is not used.
Id. (citations omitted).

129 Id. at 344 (Campbell, J., dissenting). Justice Campbell wrote for the dissent:

It must be the same in kind, and effected by the employment of his mode of
operation in substance. Whether, in point of fact, the defendants' cars did copy
the plaintiff's invention, in the sense above explained, is a question for the jury,
and the court below erred in not leaving that question to them upon the evidence
in the case, which tended to prove the affirmative.

Id.

130 Id. at 345 ("We are authorized to conclude, that his precise and definite specification
and claim were designed to ascertain exactly the limits of his invention." (citations omitted)).
limiting patent construction was foreseen and provided for, according to the dissent,
by the requirement that the invention be described in enabling terms and,
separately, that the inventor particularly specify and point out what the inventor
claims as his invention:
This danger was foreseen, and provided for, in the patent act. The patentee
is obliged, by law, to describe his invention, in such full, clear, and exact
terms, that from the description, the invention may be constructed and
used; its principle and modes of operation must be explained; and the
invention [sic] shall particularly specify and point out what he claims as
his invention. Fullness, clearness, exactness, preciseness, and particularity,
in the description of the invention, its principle, and of the matter claimed 
to be invented, will alone fulfill the demands of Congress or the wants of the 
country. 132 

The dissent concluded that, "[i]n this case, the language of the patent is full,
clear, and exact. The claim is particular and specific." 133 In effect, the dissent
asserted that the statutory requirements (Patent Act of 1836) for the specification
also defined the scope of protection: the specification must include a written
description in such full, clear and exact terms as to enable a person skilled in the art 
to practice the invention; in the case of a machine, the "principle and the several 
modes" contemplated were to be explained; and the inventor was to particularly 
"specify and point out" what he claimed to be his invention. 134 

The Supreme Court case of White v. Dunbar 135 was an infringement suit decided
following enactment of the Patent Act of 1870, which explicitly required inclusion of 



131 Id. at 347. In his dissent, Justice Campbell stated:

The claim of to-day is, that an octagonal car is an infringement of this
patent. Will this be the limit to that claim? Who can tell the bounds within which
the mechanical industry of the country may freely exert itself? What restraints
does this patent impose in this branch of mechanical art?
134 See Act of 1836, ch. 357, § 6, 5 Stat. 117 (July 4, 1836) (currently codified at 35 U.S.C. § 112
(2000)). This section reads in pertinent part:

[H]e shall deliver a written description of his invention or discovery, and of the 
manner and process of making, constructing, using, and compounding the same, 
in such full, clear, concise, and exact terms ... as to enable any person skilled in 
the art or science to which it appertains, or with which it is most nearly 
connected, to make, construct, compound, and use the same; and in case of any
machine, he shall fully explain the principle and the several modes in which he
has contemplated the application of that principle or character by which it may be
distinguished from other inventions; and shall particularly specify and point out
the part, improvement, or combination, which he claims as his own invention or
discovery.

Id.

135 White v. Dunbar, 119 U.S. 47 (1886).
claims. 136 The Court held that, in a reissued patent replacing the limitation
"textile fabric" with "enveloping material" between a metal can and shrimp contained
within the can, the reissued claim was broadened impermissibly, and that, therefore,
it was unlawfully granted. 137 Specifically, the Court found that there was nothing in
the description to support the broader claim, 138 and that,
by limiting the claim to a lining of textile fabric, the patentees were declaring that
they claimed nothing more. 139

The Court stated that, although the specification may be referred to in order to
understand the meaning of a claim, it is not to be used to change the claim. 140 The
policy behind refusing to interpret a claim beyond its literal scope, as expressed by
the Court, was to force the patentee to explicitly define his invention in fairness to
the public. 141

Neither in the case of Winans nor in White was there literal support for the
claim scope required to recover for infringement by a competitor. 142 However,
Winans was decided prior to the Patent Act of 1870. 143 The Court in that case based
136 See, e.g., Markman v. Westview Instruments, Inc., 517 U.S. 370 (1996) ("Claim practice did
not achieve statutory recognition until the passage of the Act of July 4, 1836, and inclusion of a claim
did not become a statutory requirement until 1870, Act of July 8" (citations omitted)).

137 White, 119 U.S. at 52 ("In our judgment the reissued patent in this case was unlawfully
granted, and the bill should have been dismissed.").

138 Id. at 49. Justice Bradley wrote for the majority:

The claim in the original [...]
its contents; whilst in the reissue it is for interposing between the metal can and
the shrimps an enveloping material for the shrimps. This is certainly, on its face,

a very important enlargement of the claim; and we see nothing in the context of \ 
the specification in the original patent which could possibly give the claim so 
broad a construction. 



Id. 
139 Id. at 51. The Court further noted:

We see nothing in all this to raise the slightest implication that the
patentees were the inventors of the process of interposing any and every kind of
lining between the cans and their contents; and when their claim is confined to a
lining of textile fabric, it is tantamount to a declaration that they claimed nothing
else.

140 Id. In support of its findings, the Court stated:

Some persons seem to suppose that a claim in a patent is like a nose of wax 
which may be turned and twisted in any direction, by merely referring to the 
specification, so as to make it include something more than, or something 
different from, what its words express. The context may, undoubtedly, be 
resorted to, and often is resorted to, for the purpose of better understanding the 
meaning of the claim; but not for the purpose of changing it, and making it 
different from what it is. 
Id. at 51-52.

141 Id. at 52. The Court re-established that:

The claim is a statutory requirement, prescribed for the very purpose of making
the patentee define precisely what his invention is; and it is unjust to the public,
as well as an evasion of the law, to construe it in a manner different from the
plain import of its terms.
Id.

142 Compare Winans v. Adam, 56 U.S. (15 How.) 330 (1854), with White v. Dunbar, 119 U.S. 47
(1886).

143 See generally Winans v. Adam, 56 U.S. 330 (1854).
its holding on the "mode of operation" of the patentee's invention as compared to that
of a defendant's machine. White, as previously
mentioned, was decided after enactment of the Patent Act of 1870. In contrast to
Winans, the Court in White made no attempt to derive any principle or mode of
operation from the specification and relied solely on the literal language of the
specification and original claim, essentially adopting the position of the dissent in
Winans by requiring not only that the patentee "particularly specify and point out
what he claims as his invention," but that he "describe his invention, in such full,
clear, concise, and exact terms, that from the description, the invention may be
constructed and used."

Graver Tank v. Linde Air Products Co. held that the accused welding flux, which contained
silicates of calcium and manganese, infringed a patent claim directed to a flux containing an
alkaline earth metal silicate and calcium fluoride. 150 In support of its holding, the
Court argued that a patentee should be protected from "the unscrupulous copyist." 151
The doctrine relied upon by the Court was based on Winans and, according to the
Court, could be invoked to enforce a patent against a device "if it performs
substantially the same function in substantially the same way to obtain the same 
result." 152 

The factors to be considered in determining whether a device was equivalent to 
claimed subject matter included, according to the Court in Graver Tank, knowledge 
of the "interchangeability" of ingredients by those "reasonably skilled in the art." 153
Given that manganese and magnesium were considered to be equivalent as 
Id. at 344.

See generally White v. Dunbar, 119 U.S. 47 (1886).

Id. at 52. See also Winans, 56 U.S. at 347 (Campbell, J., dissenting).

150 Graver Tank v. Linde Air Prods. Co., 339 U.S. 605, 610 (1950).

151 Id. at 607. For the majority, Justice Jackson wrote:

But courts have also recognized that to permit imitation of a patented 
invention which does not copy every literal detail would be to convert the 
protection of the patent grant into a hollow and useless thing. Such a limitation 
would leave room for, indeed encourage, the unscrupulous copyist to make
unimportant and insubstantial changes and substitutions in the patent which, 
though adding nothing, would be enough to take the copied matter outside the 
claim, and hence outside the reach of law. 
Id.

152 Id. at 608. The Court further reasoned that:

The essence of the doctrine [of equivalents] is that one may not practice a fraud on
a patent. Originating almost a century ago in the case of Winans v. Denmead, it
has been consistently applied by this Court and the lower federal courts, and it
continues today to be ready and available for utilization when the proper
circumstances for its application arise. 'To temper unsparing logic and prevent an
infringer from stealing the benefit of an invention' a patentee may invoke this
doctrine to proceed against the producer of a device 'if it performs substantially
the same function in substantially the same way to obtain the same result.'
Id. (citations omitted).

153 Id. at 609 ("An important factor is whether persons reasonably skilled in the art would have
known of the interchangeability of an ingredient not contained in the patent with one that was.").
components of welding flux, the Court found that failure to provide any explanation or
indication of independent research supported an inference of "imitation" by the defendant. 154

In their dissenting opinions, Justices Black and Douglas criticized the Court for
departing from the precedent of White. 155 They also stated that application of a
doctrine of equivalents unjustly deprives the public of notice and "emasculates" the
portion of the statute requiring that an applicant "shall particularly point out
and distinctly claim the part, improvement, or combination which he claims as his
invention or discovery," and that Congress explicitly provided for errors made by
patentees, such as not claiming the full breadth of an invention to which he is
entitled, by reissue. 156 Other than by reissue, according to the dissent, Congress
entrusted the U.S. Patent and Trademark Office ("USPTO"), rather than the courts,
to determine whether claim scope should be expanded. 157

The majority decision of [...]
claim scope was determined by equivalence of the mode or principle of the
invention, 158 despite the explicit statutory requirement imposed by the Patent Act of
[...] claim the part, improvement, or combination [...] or
discovery," and despite the holding in White, which limits claim scope to the explicit
154 Id. at 612. The court noted:

Specialists familiar with the problems of welding compositions understood that 
manganese was equivalent to and could be substituted for magnesium in the 
composition of the patented flux and their observations were confirmed by the 
literature of chemistry. Without some explanation or indication that Lincolnweld
was developed by independent research, the trial court could properly infer that 
the accused flux is the result of imitation rather than experimentation or 
invention. 






Id.



155 Id. at 614 (Black and Douglas, JJ., dissenting). Justices Black and Douglas further noted:
Today the Court . . . departs from the underlying principle which, as the Court
pointed out in White v. Dunbar, forbids treating a patent claim like a nose of wax
which may be turned and twisted in any direction, by merely referring to the
specification, so as to make it include something more than, or something
different from, what its words express ....

Id. (citations omitted).

156 Id. at 614-15. Justices Black and Douglas further discussed the emasculation of the
statute:

In seeking to justify its emasculation of R.S. § 4888 [35 U.S.C. § 33,
providing that an applicant "shall particularly point out and distinctly claim the
part, improvement, or combination which he claims as his invention or discovery"]
by parading potential hardships which literal enforcement might conceivably
impose on patentees who had for some reason failed to claim complete protection
for their discoveries, the Court fails even to mention the program for alleviation of
such hardships which Congress itself has provided. 35 U.S.C. § 64 authorizes
reissue of patents where a patent is "wholly or partly inoperative" due to certain
errors arising from "inadvertence, accident, or mistake" of the patentee.

Id.

157 Id. at 615 ("Congress was careful to hedge the privilege of reissue by exacting conditions. It 
also entrusted the Patent Office, not the courts, with initial authority to determine whether 
expansion of a claim was justified."). 

158 See generally Winans v. Denmead, 56 U.S. (15 How.) 330, 336 (1854).

159 Id.
embodiments of the specification. 159 The Court found equivalence in the case of
Graver Tank because the accused product performed "substantially the same
function in substantially the same way to obtain the same result," and stated that
the differences between the patent claims and the accused product were "colorable"
only. 161

Engineering Development Laboratories v. Radio Corp. of America 162 was an
appeal of an infringement suit decided in 1946, prior to Graver Tank. In Engineering
Development Laboratories, Judge Learned Hand, for the U.S. Court of Appeals for
the Second Circuit, addressed the question of whether claim amendments in a
reissue application constituted an impermissible broadening of the invention. 163 The
reasoning of the court was very much like the infringement test that would be
announced in Graver Tank (i.e., whether the device "performs substantially the same
function, in substantially the same way to obtain the same result"). 164 According to
the court, "any patent is entitled to some range of equivalents," and the test for
amended claim language was whether the scope would "produce substantially the
same result by substantially the same means." 165 The underlying policy for allowing
such amendments to claims was to prevent depriving inventors of protection for
embodiments of their invention "they never meant to put in the public demesne,"
despite competing interests, such as intervening rights in the case of reissue. 166
160 Id.

161 Graver Tank v. Linde Air Prods. Co., 339 U.S. 605, 612 (1950) ("Though the infringement
was not literal, the changes which avoid literal infringement are colorable only.").
162 Eng. Dev. Labs. v. Radio Corp. of Am., 153 F.2d 523 (2d Cir. 1946).
163 Id. at 524.

164 Compare Eng. Dev. Labs. v. Radio Corp. of Am., 153 F.2d 523 (2d Cir. 1946), with Graver
Tank v. Linde Air Prods. Co., 339 U.S. 605 (1950).

165 Eng. Dev. Labs., 153 F.2d at 524-25. The court wrote:

So far as concerns the change of the anode of the "rectifying" tube, we are
unable to say on this record that it enlarged the scope of the claims at all, if
proper allowance be made for possible equivalents. As we have said, the "grid"
had no function; a "rectifier" which left it out would accomplish the same result in
substantially the same way; and any patent is entitled to some range of
equivalents.

As to the amendment which substituted "resistance" for "variable
resistance" in the heater circuit, the disclosure was of a "series of resistance" in
that circuit, and the text did not prescribe that any of them should be variable,
but only, as we have said, that "one of these resistances may be a variable
resistance if desired." But once more, we cannot know whether a "resistance,"
properly designed for a heater circuit, does not produce substantially the same
result by substantially the same means as a "variable resistance" so set that it
will prevent the burning out of the filaments. A priori we should suppose that it
did.

Id. at 524-25 (emphasis added) (citations omitted). 
166 Id. at 526. The court further noted:

Nor need we say that, though no excuse is given for the delay [in amending
the claim], the doctrine [of intervening rights] applies if the change is not to a
"new invention" but involves only a minor change necessary to secure complete
protection for what the applicant originally intended to reserve to himself.
Possibly he may have as much in spite of "intervening rights." The doctrine [of
intervening rights] is designed to protect the public against abuses, not to deprive
inventors of what they plainly never meant to put into the public demesne. It is
Therefore, just as in cases, extending as far back as Odiorne, where infringement and
patentability were based on identification of substantial similarity to the "principle"
or "mode of operation" of an invention, in Engineering Development Laboratories,
support for amendment of a claim was determined by identifying substantial
similarity of the amended claim language to embodiments found in the written
description as filed.



C. Interpretation of 35 U.S.C. § 112, First Paragraph, Under the Patent Act of 1952

1. Advent of the Modern Written Description Requirement: "Equivalence,"
"Invention" and "Possession by the Inventor"

Prutton v. Fuller, which was decided after enactment of the Patent Act of 1952,
was an appeal from an interference proceeding wherein entitlement to earlier-filed
applications was determined under a general requirement of "sufficiency of
disclosure." 170 Specifically, the question was whether earlier-filed applications would
"fairly suggest" a claimed composition to one skilled in the art. 171 The court stated
that broad disclosure by an applicant does not necessarily entitle him to claim any
specific combination of disclosed elements:

It is clear, however, that when an applicant cites two or more lists of 
ingredients and indicates that any one in one list may be combined with 
any one in another, he is not necessarily entitled to claim any specific 
combination of elements which may fall within the scope of such a 

disclosure, and we have so held. As was said in the decision of the board
affirmed in the Prutton case, [...]
of a broad field to be usurped to support claims to a composition of
matter. 172

In In re Gay, Judge [...]
a decision by the Patent Office [...]



enough here to say that it certainly does not prevent amendments which go no 
further than to make express what would have been regarded as an equivalent of 
an original; or to incorporate into one claim what was to be gathered from the 
perusal of all, if read together. This is the situation here as it comes to us upon 
this record. 




Eng. Dev. Labs., 153 F.2d at 526-27.
170 Prutton v. Fuller, 230 F.2d 459, 463 (C.C.P.A. 1956) ("The issue thus presented by this
appeal is limited to the sufficiency of the disclosure of the earlier Prutton applications.").

171 Id. at 463 ("The determining factor is whether the application would fairly suggest to the
skilled worker in the art the particular composition claimed, or whether the desirability of that
composition could be ascertained only by extensive experimentation.").

172 Id. (citations omitted).

In re Gay, 309 F.2d 769, 774 (C.C.P.A. 1962).
claims as being based on "new matter" because the phrase "substantially nonporous"
was added to the specification and the claims after filing 175 and because applicants
failed to disclose the "best mode" required under 35 U.S.C. § 112. 176 The court stated 
that, because "appellants' specification would have indicated to one skilled in the art 
that all suggested container materials were to be substantially nonporous . . . 
insertion of this limitation expressly into the specification and claims did not involve 
'new matter.'" 177 The court also found that appellants met the requirement of
disclosing the "best mode" of the invention. 178 

The court stated that the first paragraph of 35 U.S.C. § 112, although including 
several requirements, has two distinct parts:

We have set forth the Patent Office position in some detail as we feel 
that it confuses, and in fact is in part contrary to, two of the several 
requirements of the first paragraph of 35 U.S.C. § 112. This paragraph 
reads as follows: 

"[A] The specification shall contain a written description of the 
invention, and of the manner and process of making and using it, in such 
full, clear, concise, and exact terms as to enable any person skilled in the 
art to which it pertains, or with which it it [sic] most nearly connected, to 



make and use the same, and shall

"[B] set forth the best mode contemplated by the inventor of carrying
out his invention." 179

The court then stated, with respect to the first part, [A], of the first paragraph 
that, in essence, the statute requires that a specification enable one skilled in the art 
to make and use the invention: "The essence of portion [A] is that a specification shall
disclose an invention in such a manner as will enable one skilled in the art to make 
and utilize it." 180 

The court also commented on 37 C.F.R. § 1.71(b), with which both the patent
examiner and the Board stated that the [...]
1.71(b) read at the time of Gay, as it does now, as follows:

The specification must set forth the precise invention for which a
patent is solicited, in such manner as to distinguish it from other inventions 
and from what is old. It must describe completely a specific embodiment of 
the process, machine, manufacture, composition of matter or improvement 
invented, and must explain the mode of operation or principle whenever 



175 Id. at 770.
176 Id. at 772.
177 Id. at 771.

178 Id. at 774 ("[I]t is manifest that appellant does not consider either perforation size,
positioning, or number to be particularly crucial aspects of his invention, and that this fact would be
appreciated by one skilled in the art who read the specification.").

179 Id. at 772 (emphasis added).

180 Id.
applicable. The best mode contemplated by the inventor of carrying out his
invention must be set forth. 181 



The court distinguished the requirement of 37 C.F.R. § 1.71(b) from the
statutory requirement of 35 U.S.C. § 112, as follows:



One final point remains to be discussed--the Patent Office requirement
based on Rule 71(b) that a "specific embodiment" of appellant's invention be
described in the specification. No direct statutory basis exists for this
requirement other than portion [A] of section 112, which it appears to
implement. The word "specific" is a somewhat indefinite term in that it
involves a matter of degree--the question, How specific?, is not answered.
Obviously, it is not necessary that an application be more specific than is
required by section 112, portion [A]. Not every last detail is to be described,
else patent specifications would turn into production specifications, which 
they were never intended to be. 182 

The court held, with respect to the claimed device: 



[T]he disclosure in the specification is such that undue experimentation
would not be necessary for one skilled in the art and that this disclosure
would be sufficient to enable him to make and use the instant invention;
and . . . that in the instant case appellant's disclosure in his specification,
taken with his disclosing in the drawing, amounts to a disclosure of a 
specific embodiment of his invention. 183 

Therefore, like the court in Engineering Development Laboratories, which relied
on identification of equivalence between an original specification and amended claim
language, 184 the court in Gay found a description of a "mode of operation or principle"
in the patent disclosure sufficient to enable one skilled in the art to make and use the
invention as claimed. 185



In In re Rainer, 186 the court held that claims directed to a process for cross-linking
polyethylene were not entitled to the filing date of an earlier-filed application. 187
Like Prutton, the basis for the court's holding was that, although all of the materials
recited in the claims were listed in the earlier-filed application, there was no
disclosure in the earlier specification to guide selection of those materials and
thereby arrive at the claimed invention. 188 The court in Rainer distinguished the
181 37 C.F.R. § 1.71(b) (2004) (emphasis added).

182 Gay, 309 F.2d at 774 (emphasis added).

183 Id.

184 See supra Section III.B.2.

185 Gay, 309 F.2d at 773.

In re Rainer, 347 F.2d 574, 578 (C.C.P.A. 1965).

188 Id. at 577. The court held that:

True, one may, by blind, unguided selection of the claimed materials from the 
some 53 listed materials, ultimately arrive at the claimed invention, but there is 
nothing disclosed in the specification by which the skilled person in this art will 
requirement of Rule 71[b] that the specification "set forth the precise invention" from
the literal language of 35 U.S.C. § 112:



Where, however, the process claims are directed to the use of specific
materials in the process and the specification discloses nothing to guide
such a person in making the selection of such specific materials from the
rather extensive catalog of materials recited, we think the spirit if not the
actual provisions of section 112 are not met. Certainly, the specification
here does not "set forth the precise invention for which a patent is solicited,"
as required by Rule 71[b]. 189
Therefore, whereas the court in Gay considered Rule 71[b] not to exceed the
literal requirement of the first paragraph of 35 U.S.C. § 112, of enabling the claimed
invention, 190 the court in Rainer suggested that the literal language of Rule 71[b],
requiring the specification to "set forth the precise invention," may exceed the literal
meaning of 35 U.S.C. § 112. 191

The Court of Customs and Patent Appeals in In re Ruschig 192 affirmed a decision
by the Board rejecting a claim (claim 13) directed to a specific compound,
[...]
a patent examiner for the purpose of an interference proceeding that later was
dissolved by the examiner on his own motion. 194 In the interim, divisional
applications were filed by the parties to the interference that included, in one of the
divisional applications, claims 3 and 7, specifying the same compound as that in
claim 13 of the earlier-filed application. 195 Patentability of the subject matter of
claims 3 and 7 in the divisional application was contingent upon successful reliance
on the earlier filing date. 196 The Board held that the claims in the divisional
application were not patentable because the specified compound was not disclosed in
the patent application. 197 On appeal, the court stated the issue to be as follows:

The sole issue on this appeal is whether claim 13 [of the parent 
application] is supported by the disclosure of appellants' application, a
question which had not been raised in this case at the time of the prior 
appeal [reversing a rejection of twelve claims of the same application based 
on prior art]. 198 
be guided in making these particular selections in his efforts to practice the
invention. 

189 Id. (emphasis added). 

Rainer, 347 F.2d at 577.
192 In re Ruschig, 379 F.2d 990 (C.C.P.A. 1967).
193 Id. at 991.
194 Id.
195 Id.
196 Id.
197 Id.
198 Id.



The court held that the parent application did not explicitly disclose the claimed 
compound, nor did it provide sufficient guidance to lead one skilled in the art to the 
claimed compound. 199 Specifically, as stated by the court:



But looking at the problem, as we must, from the standpoint of one with no
foreknowledge of the specific compound, it is our considered opinion that the 
board was correct in saying: 

Not having been specifically named or mentioned in any manner, one is 
left to selection from the myriads of possibilities encompassed by the broad 
disclosure, with no guide indicating or directing that this particular 
selection should be made rather than any of the many others which could 
also be made. 200 

Appellants argued that the written description of the class of compounds that 
embraced the specific compound of claim 13 was sufficient because it would enable 
one skilled in the art to make the claimed compound. 201 The court argued, in 
response, that the question was not whether one motivated to make the specific 
compound would be enabled by the specification to do so, but whether appellants had, 
in fact, invented the specific compound claimed: 

We find the argument unpersuasive for two reasons. First, it presumes
some motivation for wanting to make the compound in preference to others. 
While we have no doubt a person so motivated would be enabled by the 
specification to make it, this is beside the point for the question is not 
whether he would be so enabled but whether the specification discloses the 
compound to him, specifically, as something appellants actually invented. 
We think it does not. 202 

The basis for stating that the compound of claim 13 was not "something 
appellants actually invented" was the court's finding that the specification failed to 
provide sufficient description to enable one skilled in the art to select the compound 
from the "myriad" of possibilities of the broad disclosure. 203 The court stated, in
other words, that identification of each of the variables from which selection is to be 
made is not sufficient, without guidance, to support any particular combination of 
those variables: 

It is an old custom in the woods to mark trails by making blaze marks on
the trees. It is no help in finding a trail or in finding one's way through the
woods where the trails have disappeared--or have not yet been made, which
is more like the case here--to be confronted simply by a large number of
unmarked trees. Appellants are pointing to trees. We are looking for blaze
marks which single out particular trees. We see none. 204 
Although the specification disclosed a class that included the specific compound,
the court found that there was insufficient guidance in the specification to lead one
skilled in the art from the class to the specific compound claimed. 205 Therefore, the
"invention" found lacking by the court was not an invention that the broad disclosure 
failed to embrace, but, instead, a selection not conveyed to one skilled in the art. As 
stated by the court: 



Second, we doubt that the rejection is truly based on section 112, at least on
the parts relied on by appellants. If based on section 112, it is on the
requirement thereof that "The specification shall contain a written
description of the invention * * * ." We have a specification which describes
appellants' invention. The issue here is in no wise a question of its
compliance with § 112, it is a question of fact: Is the compound of claim 13
described therein? Does the specification convey clearly to those skilled in 
the art, to whom it is addressed, in any way, the information that 
appellants invented that specific compound? Having considered the 
specification in the light that has been shed on it by all the arguments pro 
and con, we conclude that it does not. 206 
The court in Ruschig, therefore, resolved the discrepancy between Gay and
Rainer by holding that a specification lacking adequate guidance to one
skilled in the art that "appellants invented" the claimed invention did not provide a
"written description of the invention," as required by 35 U.S.C. § 112.

After Ruschig, a distinction between an "enablement requirement" and a 
"description requirement" within the first paragraph of 35 U.S.C. § 112 was explicitly 
recognized. For example, in In re DiLeone, the court stated how the
"enablement" requirement might be met without satisfying the "written description
requirement":

For greater clarity on this point, consider the case where the
specification discusses only compound A and contains no broadening
language of any kind. This might very well enable one skilled in the art to 
make and use compounds B and C; yet the class consisting of A, B and C 
has not been described. The first paragraph of § 112 requires both 
description and enablement. 210 
204 Id. at 994-95.

205 Id. at 995-96 (emphasis added).
206 Id. (emphasis added).

In re Ruschig, 379 F.2d 990, 995 (C.C.P.A. 1967).

210 In re DiLeone, 436 F.2d 1404, 1405 n.1 (C.C.P.A. 1971) (emphasis added).
Board decision as being "narrowed" to a "description requirement" of the first
paragraph of 35 U.S.C. § 112. 212 The court relied on [...]
decided as whether the appellants "invented" the claimed subject matter. 213 The
court found that, although the specification and original claims taught that the
"segmentizing medium" is "air or other gas which is inert to the liquid," it would
"naturally occur" to one skilled in the art to employ an "inert fluid," as later claimed,
regardless of whether the fluid was a liquid or a gas. 214 Upon the facts of the case,
the court concluded that "applicants invented a sample analyzer with an inert fluid
segmentizing medium." 215

The court in Gay construed the first paragraph of 35 U.S.C. § 112 to include only
two parts, that of enabling the claimed invention and of providing the best mode
contemplated by the inventor of carrying out the invention. 216 However, the court in
Smythe held that a "description" requirement under the first paragraph was met
because the broader claim limitation of "fluid" would, in view of the teachings of the
specification, "naturally occur to one skilled in the art reading the description of the
use of air or other gas as a segmentizing medium to separate the liquid samples." 217

The court in Smythe distinguished the [...]
skilled in the art would not necessarily predict performance of a general class in view
of selected species or subcombinations disclosed; where there is such
unpredictability, one skilled in the art would not have been "found to have been

212 In re Smythe, 480 F.2d 1376, 1382 n.2 (C.C.P.A. 1973). The court reasoned that:

The solicitor states that the Board's rationale makes it clear that it
regarded 35 U.S.C. § 112, paragraph 1 as the proper statutory basis of its
rejection, and particularly argues that appellants fail to describe their invention
in their specification. 2

2. The board may have also treated the rejection of these claims under § 112
under the 'enablement' section of the first paragraph, but the solicitor has
narrowed the rejection by his argument to the 'description' requirement.

Id. (emphasis added). 

213 Id. at 1382. The court further noted:

The question which must be answered is whether the application originally filed 
in the Patent Office clearly conveyed in any way to those skilled in the art, to 
whom it is addressed, the information that appellants invented the analysis 
system with an inert fluid as the segmentizing medium. If it did, then appellants 
have made a written description of their invention within the meaning of the first 
paragraph of 35 U.S.C. § 112. 

Id. (citations omitted). 

214 Id. at 1383. The court held that:

We believe that the use of an inert fluid broadly in this invention would
naturally occur to one skilled in the art reading the description of the use of air or 
other gas as a segmentizing medium to separate the liquid samples. While fluid 
is a broader term, encompassing liquids, as noted by the solicitor, the specification 
clearly conveys to one skilled in the art that in this invention the characteristics 
of a fluid are what make the segmentizing medium work in this invention. 

Id. (emphasis added). 

215 Id. at 1384 ("Likewise, we find in the facts here a description of the use and function of the 
segmentizing medium which would convey to one skilled in the sample-analysis art the knowledge
that applicants invented a sample analyzer with an inert fluid segmentizing medium."). 

216 Id. at 1383.

217 Id. (emphasis added).
broad language "that would naturally occur to one skilled in the art," according to the
court, would cause an extreme burden on applicants and the public:
The alternative places upon patent applicants, the Patent Office, and the
public the undue burden of listing, in the case of applicants, reading and
examining, in the case of the Patent Office, and printing and storing, in the
case of the public, descriptions of the very many structural or functional

equivalents of disclosed elements or steps which are already stored in the 
minds of those skilled in the arts, ready for instant recall upon reading the 
descriptions of specific elements or steps. 219 

The Court of Customs and Patent Appeals in In re Wertheim 220 stated that "The
function of the description requirement is to ensure that the inventor had possession,
as of the filing date of the application relied on, of the specific subject matter later
claimed by him . . . ." 221 Thereafter, "possession by the inventor" was invoked as the
threshold in several cases decided under the "written description requirement." For
example, the court in In re Blaser 222 relied on Wertheim to state that the function of the
written description requirement is to "ensure that the applicant had possession, as of
the filing date of the application relied on, of the specific subject matter later claimed
by him." 223 Also in 1977, the Court of Customs and Patent Appeals in In re Driscoll 224
heard an appeal by appellants from a decision by the Board that a claim directed to a
chemical compound was not supported in "full, clear, and exact terms," as is required
by the first paragraph of 35 U.S.C. § 112. 225 The question was whether the appealed
claim was entitled to the filing date of an earlier application, thereby antedating an
intervening prior art reference. 226 The earlier application included a "Markush
group" of fourteen constituents, one of which was the support required to entitle the
appellant to the earlier filing date. 227 The court held that one skilled in the art would

218 Id.

This is not a case where there is any unpredictability such that appellants' 
description of air or other inert gas would not convey to one skilled in the art 
knowledge that appellants invented an analysis system with the fluid 
segmentizing medium. In other cases, particularly but not necessarily, chemical
cases, where there is unpredictability in the performance of certain species or 
subcombinations other than those specifically enumerated, one skilled in the art 
may be found not to have been placed in possession of a genus or combination 
claimed at a later date in the prosecution of a patent application. 
Id. (citations omitted). 

219 Id. at 1384. 

221 In re Wertheim, 541 F.2d 257, 262 (C.C.P.A. 1976).

223 In re Blaser, 556 F.2d 534, 537 (C.C.P.A. 1977) (citations omitted).

225 In re Driscoll, 562 F.2d 1245, 1247 (C.C.P.A. 1977).

226 Id.

227 Id. at 1249. The court noted:

We thus agree with appellant that a skilled artisan would recognize from
the disclosure of S.N. 782,756 fourteen distinct classes of compounds .... This
being the case, it follows that S.N. 782,756 describes the subject matter of claim
13 inasmuch as one of the fourteen classes of compounds is the 5-alkylsulfonyl-
1,3,4-thiadiazole ureas defined therein.
recognize that, as [...]
invention claimed because [...] it as such from the
earlier filed application." 228 The Driscoll court based public policy for the holding on
a quotation from Engineering Development Laboratories v. Radio Corp. of America,
in which the court employed equivalence [...]



If, when [applicants] yield any part of what they originally believed to
be their due, they substitute a new "invention," only two courses will be open
to them: they must at the outset either prophetically divine what the art
contains, or they must lay down a barrage of claims, starting with the 
widest and proceeding by the successive incorporation of more and more 
detail, until all combinations have been exhausted which can by any 
possibility succeed. The first is an impossible task; the second is a custom 
already more honored in the breach than in the observance, and its 
extension would only increase that surfeit of verbiage which has for long 
been the curse of patent practice, and has done much to discredit it. It is 
impossible to imagine any public purpose which it could serve" 23 ! 

In Vas-Cath Inc. v. Mahurkar, the CAFC addressed whether a claim to priority
under 35 U.S.C. § 120 to an earlier design application should have been denied
because the drawings did not provide an adequate "written description" of the
claimed invention as required by the first paragraph of 35 U.S.C. § 112. 233 The court
recounted a history of the written description requirement that began by reciting the
third section of the Patent Act of 1793, stating that the patent applicant must
"deliver a written description of his invention . . . ." 234 Objects that were set forth in
Evans under the Patent Act of 1793 were, according to the Vas-Cath court, a
"historical" explanation for "written description" and "definiteness" requirements
under the first and second paragraphs of 35 U.S.C. § 112, respectively. 235 The court
then recited a "second, policy-based rationale for the inclusion in section 112 of both



228 Id. at 1248-49. The court stated that:

In resolving this issue, we must view the disclosure of the earlier filed
application as would a person skilled in the art and determine whether it
reasonably conveys the information that as of the filing date thereof appellant had
possession of the class of 5-alkylsulfonyl-1,3,4-thiadiazole ureas defined in claim
13. We are satisfied that it does 

[W]e believe that, in reality, the exemplified structural formula constitutes the 
essence of appellant's invention and that one skilled in the art would recognize it 
as such from the earlier filed application. 

Id. (emphasis added). 

231 Driscoll, 562 F.2d at 1250 (quoting Eng. Dev. Labs. v. Radio Corp. of Am., 153 F.2d 523,
526-27 (2d Cir. 1946)) (emphasis added in Driscoll).

233 Vas-Cath Inc. v. Mahurkar, 935 F.2d 1555, 1559 (Fed. Cir. 1991).
234 Id. at 1561 (quoting Evans v. Eaton, 20 U.S. 356, 430 (1822)).
235 Vas-Cath, 935 F.2d at 1560-61.

[T]here is a subtle relationship between the policies underlying the
description and definiteness requirements, as the two standards, while
complementary, approach a similar problem from different directions.
Adequate description of the invention guards against the inventor's
overreaching by insisting that he recount his invention in such detail that 
his future claims can be determined to be encompassed within his original 
creation. The definiteness requirement shapes the future conduct of 
persons other than the inventor, by insisting that they receive notice of the 
scope of the patented device. 237 
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As discussed, the court in Evans required a specification to provide a description 
that would prevent the inventor from "practicing upon the credulity or the fears of
other persons, by pretending that his invention is more than it really is . . . ." 238

The basis for the concern at that time was whether an inventor, who had improved
upon the prior art, was entitled to a patent on the improved device as a whole, or only
on the portion of the device that represented the improvement. 239 Further, even at
the time of Evans, the principle of an invention could embrace later improvements,
as set forth in Section 2 of the Patent Act of 1793, separately from the "written
description" requirement of Section 3 of that Act. 240 The court, in Vas-Cath,
acknowledged that preventing an inventor from overreaching was a "second,
policy-based rationale," distinct from the object of Evans, of "taking from the inventor
the means of practicing upon the credulity or fears of other persons by pretending
that his invention is more than it really is . . . ." 241 Therefore, the "second,
policy-based rationale," requiring a written description to prevent an
inventor from "overreaching" by requiring that "he recount his invention in such
detail that his future claims can be determined to be encompassed within his original
creation," 242 was new; it did not originate in the Patent Act of 1793, at least in view of
the objects of the Act as explained in Evans and recited in

236 Id. at 1561.

237 Id. (quoting Rengo Co. v. Molins Mach. Co., 657 F.2d 535 (3d Cir. 1981), cert. denied, 454
U.S. 1055 (1981)).

238 Evans v. Eaton, 20 U.S. (7 Wheat.) 356, 434 (1822).

239 Id. at 434-35. The court reasoned that:

The specification must describe the invention "in such full, clear, and distinct
terms, as to distinguish the same from all other things before known." How can
that be a sufficient specification of an improvement in the machine, which does
not distinguish what the improvement is, nor state in what it consists, nor how
far the invention extends? ... we do not say that the party is bound to describe
the old machine, but we are of opinion that he ought to describe what his own
improvement is, and to limit his patent to such improvement.



2<o See supra Section riJ,B.l. 

241 Vas-Cath Inc. v. Mahurkar, 935 F.2d 1555, 1561 (Fed. Cir. 1991).

242 Id.

In Lockwood v. American Airlines, the CAFC affirmed a district court decision
that two of three intervening applications failed to maintain continuity,
thereby denying entitlement to a necessary earlier filing date. 244 Judge Lourie stated
that entitlement to a filing date "does not extend to subject matter which is not
disclosed, but would be obvious over what is expressly disclosed." 245 The court stated
that a demonstration that one is "in possession" of the invention is made by
providing a description of the invention, not that which would make the invention
obvious, 246 and that a written description demonstrates possession of the invention
by "words, structures, figures, diagrams, formulas, etc., that fully set forth the
claimed invention." 247

The court then went on to state that, although the invention need not be
described in exact terms, the description must be equivalent to the claimed subject
matter: "[A]lthough the exact terms need not be used in haec verba . . . the
specification must contain an equivalent description of the claimed subject matter. A
description which renders obvious the invention for which an earlier filing date is
sought is not sufficient." 249

Therefore, as in Engineering Development Laboratories, 251 which was decided in
1946, prior to the Patent Act of 1952, adequate support in Lockwood was a literal
description in a specification equivalent to claimed subject matter. 252 Further, as
was also true in earlier cases, the court in Lockwood held that speculation "as to
modifications that the inventor might have envisioned" was not enough; the
invention must be described:



It is not sufficient for purposes of the written description requirement of § 
112 that the disclosure, when combined with the knowledge in the art, 
would lead one to speculate as to modifications that the inventor might 
have envisioned, but failed to disclose. Each application in the chain must 
describe the claimed features. 254 



244 Lockwood v. Am. Airlines, 107 F.3d 1565, 1576 (Fed. Cir. 1997).
245 Id. at 1571-72.
246 Id. at 1572.

The question is not whether a claimed invention is an obvious variant of that
which is disclosed in a specification. Rather, a prior application itself must
describe an invention, and do so in sufficient detail that one skilled in the art can
clearly conclude that the inventor invented the claimed invention as of the filing
date sought .... One shows that one is 'in possession' of the invention by
describing the invention, with all its claimed limitations, not that which makes it
obvious.
Id. (emphasis added) (citations omitted).
247 Id.

249 Id. (citations omitted).

251 Eng. Dev. Labs. v. Radio Corp. of Am., 153 F.2d 523 (2d Cir. 1946).
252 Lockwood, 107 F.3d at 1570.
254 Id. at 1572.

In summary, under Lockwood and its legal precedents, an adequate written
description must enable one skilled in the art to understand, from the literal teaching
or its equivalent, the scope of the claimed invention.

In In re Fisher, the Court of Customs and Patent Appeals reviewed a rejection by
the Board for insufficient disclosure under the first paragraph of 35 U.S.C. § 112. 256
The court held that, despite failure to provide a structural description of porcine
adrenocorticotrophic hormones (ACTH), porcine (hog) extracts disclosed in the parent
application provided adequate support under the first paragraph of 35 U.S.C. § 112
for rejected claim 4 257 because they inherently included the sequence recited in the
claim. 258 On the other hand, the court held that the claimed hormone, which was not
limited to that of any particular animal, could not be broadened beyond the
thirty-nine amino acid sequence of hog pituitary extracts because there was no
confirmation that the structure of ACTHs of other animals would be the same and
the specification would not enable one skilled in the art to "make or obtain" ACTHs
having other amino acid sequences. 259 Therefore, in view of Fisher, a specification
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In re Fisher. 427 F.2d 833. 840 (C.C.P.A. 1970). 

257 Id. atML_Claim 4 reads as follows: 

4. An adrenocorticotrophic hormone preparation containing at least 1
International Unit of ACTH per milligram and containing no more than 0.08 units
of vasopressin and no more than 0.05 units of oxytocin per International Unit of
ACTH, and being further characterized as containing as the active component of
[a?] polypeptide of at least 24 amino acids having the following sequence from the
N terminus of the molecule: Serine, Tyrosine, Serine, Methionine, Glutamic Acid,
Histidine, Phenylalanine, Arginine, Tryptophan, Glycine, Lysine, Proline, Valine,
Glycine, Lysine, Lysine, Arginine, Arginine, Proline, Valine, Lysine, Valine,
Tyrosine, Proline.
Id. at 835.

258 Id. at 836. The court set out that:

The examiner took the position that ... the parent contained insufficient 
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disclosure to support claim 4 in the manner required by the first paragraph of 35 
U.S.C. § 112. The board affirmed this rejection for two reasons. First, since the 
parent application lacked any structural description of the ACTH extracts therein
disclosed, the Board concluded that it could not be determined whether those
products would meet the terms of claim 4, which recites a specific sequence of the
first 24 amino acids. Appellant contended that the parent application inherently
disclosed products meeting the terms of claim 4, even though appellant did not
know the chemical structure of those products when the parent application was
filed. Appellant cited several cases in support of the proposition that inherent
disclosure is sufficient under 35 U.S.C. § 112 ... . We agree with appellant that
this finding was erroneous .... The hog-extracted products disclosed in
appellant's parent application must therefore have had the recited sequence.
(emphasis added).
259 Id. The court set out that:

The board's second reason for holding the parent application insufficient to
support claim 4 was that the products disclosed in the parent were insufficient to
support a claim of the breadth of claim 4. On this point we agree with the board.
. . . We do not know, from the record, the chemical structure of ACTHs of whales 
or other animals. Appellants' parent application, therefore, discloses no products, 
inherently or expressly, containing other than 39 amino acids, yet the claim

meets the requirement of the first paragraph of 35 U.S.C. § 112, and there is no need
to provide literal support for a claimed sequence of amino acids where the sequence
is inherent in a product otherwise adequately described. Further, a finding that a
specification fails to support a claim beyond the literal or inherent description within
the specification may be based on lack of enablement of one skilled in the art to
"make or obtain" the product as broadly claimed.

DNA sequences that will encode any polypeptide having an amino acid sequence
"'sufficiently duplicative' of EPO to possess the property of increasing production of
red blood cells," was invalid under the enablement requirement of 35 U.S.C. § 112. 261
The court approved of reliance by the district court on the portion of the decision in
Fisher holding that a claim directed to a polypeptide having at least twenty-four
amino acids of a sequence, without more, was inadequately supported under the
enablement requirement of 35 U.S.C. § 112 by a specification that disclosed only a
thirty-nine amino acid product. 262

With respect to the EPO gene, the CAFC emphasized that the disclosure in
Amgen's patent was not sufficient to enable the scope of DNA sequences claimed:

Moreover, it is not necessary that a patent applicant test all the
embodiments of his invention . . . ; what is necessary is that he provide a
disclosure sufficient to enable one skilled in the art to carry out the
invention commensurate with the scope of his claims. For DNA sequences,
that means disclosing how to make and use enough sequences to justify
grant of the claims sought. Amgen has not done that here. . . . What is
relevant depends on the facts, and the facts here are that Amgen has not
enabled the preparation of DNA sequences to support its all-encompassing
claims.

It is not sufficient, having made the gene and a handful of analogues whose
activity has not been clearly ascertained, to claim all possible genetic
sequences that have EPO-like activity. Under the circumstances, we find
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includes all polypeptides, of the recited potency and purity, having at least 24 
amino acids in the chain in the recited sequence. The parent specification does
not enable one skilled in the art to make or obtain ACTHs with other than 39
amino acids in the chain, and there has been no showing that one of ordinary skill
would have known how to make or obtain such other ACTHs without undue
experimentation. . . . In the latter situation, the statement is in no way
'enabling' and hence lends no further support for the broad claim.
Id. (emphasis added).

261 Amgen v. Chugai, 927 F.2d 1200, 1212 (Fed. Cir. 1991).

262 Id. at 1214. The court found that:

The district court properly relied on Fisher in making its decision. In that
case, an applicant was attempting to claim an adrenocorticotrophic hormone
preparation containing a polypeptide having at least twenty-four amino acids of a
specified sequence. Only a thirty-nine amino acid product was disclosed. The
court found that applicant could not obtain claims that are insufficiently
supported and hence not in compliance with the first paragraph of 35 U.S.C. §
112.

Id.

were invalid for failure to meet the enablement requirement of 35 U.S.C. § 112. 263

The court also held that, under 35 U.S.C. § 102(g), 264 conception of a gene, in the
absence of a written description that is sufficient to distinguish it from other
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that reduction to practice is defined as isolation of the gene 

Conception does not occur unless one has a mental picture of the structure 
of the chemical, or is able to define it by its method of preparation, its 
physical or chemical properties, or whatever characteristics sufficiently 
distinguish it. It is not sufficient to define it solely by its principal 
biological property, e.g., encoding human erythropoietin, because an alleged 
conception having no more specificity than that is simply a wish to know 
the identity of any material with that biological property. We hold that 
when an inventor is unable to envision the detailed constitution of a gene so 
as to distinguish it from other materials, as well as a method of obtaining it,
conception has not been achieved until reduction to practice has occurred,
i.e., until after the gene has been isolated. 266

Therefore, the court in Amgen did not require a written description that listed 
all nucleic acid sequences to demonstrate conception of the invention under 35 U.S.C. 
§ 102(g). Rather, as in Fisher, with respect to providing sufficient disclosure of a 
claimed polypeptide under the first paragraph of 35 U.S.C. § 112, only isolation of the 
claimed product was required in the absence of an explicit claimed structure. 267

Fiers v. Revel was an appeal from a decision by the Board in a three-way
interference between parties Fiers, Revel, and Sugano. 268 The party Fiers argued
that an enabling method for obtaining a DNA sequence was essentially the same as
proving conception of the DNA. 269 The court explained, in response to Fiers'



263 Id. at 1214 (citations omitted).

264 In analyzing the point, the court noted that under § 102(g):

A person is entitled to a patent unless -- (g) before the applicant's invention thereof
the invention was made ... by another who had not abandoned, suppressed, or
concealed it. In determining priority of invention there shall be considered not
only the respective dates of conception and reduction to practice of the invention,
but also the reasonable diligence of one who was first to conceive and last to
reduce to practice, from a time prior to conception by the other.
Id. at 1205.

265 Id. at 1206 (emphasis added).

267 Id. at 1214.

268 Fiers, 984 F.2d at 1167.

269 Id. at 1168. The court stated that:

[Appellant] Fiers suggests that the standard for proving conception of a DNA by
its method of preparation is essentially the same as that for proving that the
method is enabling. Fiers thus urges us to conclude that since his method was
enabling for the DNA of the count, he conceived it in the United States when

argument, that a method of preparation may be employed as support for conception
of a DNA sequence when the DNA was claimed by a
product-by-process claim:

Our statement in Amgen that conception may occur, inter alia, when one is
able to define a chemical by its method of preparation requires that the
DNA be claimed by its method of preparation. We recognized that, in
addition to being claimable by structure or physical properties, a chemical
material can be claimed by means of a process. A product-by-process claim
normally is an after-the-fact definition, used after one has obtained a
material by a particular process. Before reduction to practice, conception
only of a process for making a substance, without a conception of a
structural or equivalent definition of that substance, can at most constitute
a conception of the substance claimed as a process. 270

The court's reasoning in Fiers arguably extends beyond that of Amgen in that, in 
Amgen, the method provided for obtaining the EPO gene was characterized as 
"uncertain." 271 In contrast to Amgen, there was evidence in the form of affidavits 
that one of ordinary skill in the art would have been able to isolate β-IF DNA based
on the protocol proposed by Fiers without undue experimentation. 272 The court in
Fiers, however, partitioned conception of DNA from any question of enablement
related to its isolation, regardless of certainty: "Fiers has devoted a considerable
portion of his briefs to arguing that his method was enabling. The issue here,
however, is conception of the DNA of the count, not enablement. Enablement
concerns teaching one of ordinary skill in the art how to practice the claimed
invention." 273

With respect to Fiers' argument for priority of invention, the court stated that, 
when a substance was claimed "per se," conception required a "structure, name, 
formula, or definitive chemical or physical properties." 274 

The CAFC also denied entitlement to an earlier-filed Israeli application to Revel,
who was the second party in the three-way interference of Fiers. 275 The court stated

Gilbert and Sharp entered the country with the knowledge of, and detailed notes
concerning, Fiers' process for obtaining it.

270 Id. at 1169.

271 Amgen v. Chugai, 927 F.2d 1200, 1207 (Fed. Cir. 1991).

As expert testimony from both sides indicated, success in cloning the EPO gene
was not assured until the gene was in fact isolated and its sequence known.
Based on the uncertainties of the method and lack of information concerning the
amino acid sequence of the EPO protein, the trial court was correct in concluding
that neither party had an adequate conception of the DNA sequence until
reduction to practice had been achieved.

Id.

272 Fiers, 984 F.2d at 1168 ("Specifically, the Board determined that Fiers' disclosure of a
method for isolating the DNA of the count, along with expert testimony that his method would have
enabled one of ordinary skill in the art to produce that DNA . . . .").

273 Id. at 1169.

274 Id. ("Conception of a substance claimed per se without reference to a process requires
conception of its structure, name, formula, or definitive chemical or physical properties.").

275 Id. at 1171.
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priority to an earlier filing date, the earlier application needed to have described the 
DNA "itself." 277 The court succinctly stated the Board's reasoning:
"Relying on Amgen, the Board concluded that the Israeli application was not
enabling since Revel had not conceived the DNA of the count and '[l]ogically, one
cannot . . . enable an invention that has not been conceived.'" 278

The court then paraphrased its explanation in Amgen stating that a method for
obtaining claimed DNA, without more, is inadequate as a conception of the DNA and,
therefore, enablement need not be considered:

As we stated in Amgen and reaffirmed above, such a disclosure just 
represents a wish, or arguably a plan, for obtaining the DNA. If a 
conception of a DNA requires a precise definition, such as by structure, 
formula, chemical name, or physical properties, as we have held, then a 
description also requires that degree of specificity. To paraphrase the 
Board, one cannot describe what one has not conceived. 279 

Revel asserted that, "since the language of the count refers to a DNA and not to 
a specific sequence, the specification need not describe the sequence of the DNA in 
order to satisfy the written description requirement." 280 The court disagreed and 
affirmed the Board's reasoning that, although "what is needed to meet the
description requirement will necessarily vary depending on the nature of the
invention claimed," there must be a demonstration to one skilled in the art that the
inventor had possession of the claimed invention. 281 The court found that a count, or
claim, that covers all DNAs that code for a specific protein is analogous to a single
means claim and, therefore, does not comply with the first paragraph of 35 U.S.C. §
112:

Because the count at issue purports to cover all DNAs that code for 
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before it has arrived. 282 



276 Id. ("Revel's application does not even demonstrate that the disclosed method actually leads
to the DNA, and thus that he had possession of the invention, since it only discloses a clone that
might be used to obtain mRNA coding for β-IF.").

277 Id. at 1170-71 ("An adequate written description of DNA requires more than a mere
statement that it is part of the invention and reference to a potential method for isolating it; what is
required is a description of the DNA itself. Revel's specification does not do that.").

278 Id. at 1170.

279 Id. at 1171.

280 Id. at 1170.

281 Id. ("On reconsideration, the Board correctly set forth the legal standard for
sufficiency of description: the specification of Revel's Israeli application must
'reasonably convey[] to the artisan that the inventor had possession at that time of
the . . . claimed subject matter.'" (citations omitted)).

282 Id. at 1171.



[1:1Q0 A 2W23 — 1 Universi t y of Roch e s t er v. G.D. Sea i l e & Co. 



The court declined to decide whether Revel's prior application was enabling. 283 



The third party, Sugano, won the interference. 284 Sugano was the first to not
only describe a method for obtaining a DNA that coded for β-IF, but to provide a
complete and correct nucleotide sequence, thereby meeting the enablement and
written description requirements, respectively. 285

Therefore, the courts in both Amgen and Fiers required actual reduction to
practice in order to demonstrate conception. However, in both cases, the requirement
for actual reduction to practice rested on lack of certainty of enablement of the
disclosed method. 286 In Amgen, certainty was demonstrated by isolation of the gene;
"envision[ing] the detailed constitution of a gene so as to distinguish it from other
materials" did not matter if the "method of obtaining it" had been proven to work. 287

In Fiers, the court, in explaining Amgen, also appeared to rely on certainty and, thus,
enablement, to demonstrate conception by stating that, "[b]efore reduction to
practice, conception only of a process for making a substance, without a conception of
a structural or equivalent definition of that substance, can at most constitute a
conception of the substance claimed as a process." 288 In other words, the court
implied that conception of a claimed substance would hinge on the success of the
process described for obtaining it. Therefore, despite the statement in Fiers
specifically partitioning sufficiency of a written description (to demonstrate
conception) from enablement, the court, by relying on actual reduction to practice,
was consistent with the court in Amgen, which found actual reduction to practice and
inherency to be a substitute for physical description of a composition, such as DNA. 289

The CAFC in The Regents of the University of California v. Eli Lilly and Co.
held that claims directed to cDNA sequences other than those found in rats were
invalid for failure to comply with the written description requirement of the first
paragraph of 35 U.S.C. § 112. 291 However, with respect to each of the claims, the



283 Id. at 1171 n.12 ("In light of our disposition of the written description question, we do not
address whether Revel's Israeli application satisfies the enablement requirement.").
284 Id. at 1172.
285 Id.

We conclude that Sugano is entitled to rely on his disclosure as enabling
since it sets forth a detailed teaching of a method for obtaining a DNA coding for
β-IF .... We also conclude that Sugano's application satisfies the written
description requirement since it sets forth the complete and correct nucleotide
sequence of a DNA coding for β-IF and thus 'convey[s] with reasonable clarity to
those skilled in the art that, as of the filing date sought, [Sugano] was in
possession of the [DNA coding for β-IF].'
Id. (citations omitted).

286 See generally Amgen v. Chugai, 927 F.2d 1200 (Fed. Cir. 1991); Fiers v. Revel, 984 F.2d
1164 (Fed. Cir. 1993).

287 Amgen, 927 F.2d at 1206.

We hold that when an inventor is unable to envision the detailed constitution of a gene
so as to distinguish it from other materials, as well as a method for obtaining it,
conception has not been achieved until reduction to practice has occurred, i.e.,
until after the gene has been isolated.

Id.

288 Fiers, 984 F.2d at 1169.
289 Id.

291 Regents of the Univ. of California v. Eli Lilly and Co., 119 F.3d 1559, 1566 (Fed. Cir. 1997).

analysis employed by the court included considerations of enablement. 292 For
example, U.S. Patent No. 4,652,525 ("the '525 patent") 293
included a claim (claim 5) directed to human insulin cDNA. 294 A constructive
example (Example 6) of the specification of the '525 patent provided a general
method for obtaining human cDNA, which was actually employed to isolate rat
cDNA, and provided the amino acid sequences of human insulin A and B chains. 295
The court stated that providing an amino acid sequence is not an adequate
description of a specific cDNA sequence due to redundancy of the genetic code. 296 The
court also stated that, absent the cDNA sequence, a general method (e.g., the method
employed by Example 6) did not meet the written description requirement of the first
paragraph of 35 U.S.C. § 112:

This example [Example 6], however, provides only a general method for
obtaining human cDNA (it incorporates by reference the method used to 
obtain the rat cDNA) along with the amino acid sequences of human insulin 
A and B chains. Whether or not it provides an enabling disclosure, it does 
not provide a written description of the cDNA encoding human insulin, 
which is necessary to provide a written description of the subject matter of 
claim 5. 297 
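
The redundancy of the genetic code on which the court relied can be illustrated with a short calculation. The following is a minimal sketch only; it is not drawn from the opinion or from the '525 patent. It assumes the standard genetic code, and the names SYNONYMOUS_CODONS and count_encodings, as well as the four-residue example peptide, are hypothetical and chosen purely for illustration.

```python
# Number of codons that encode each amino acid in the standard genetic code
# (one-letter amino acid symbols; the three stop codons are excluded).
# Assumption: the standard code applies; the names here are illustrative only.
SYNONYMOUS_CODONS = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "F": 2, "Y": 2, "H": 2, "Q": 2, "N": 2,
    "K": 2, "D": 2, "E": 2, "C": 2,
    "M": 1, "W": 1,
}


def count_encodings(peptide: str) -> int:
    """Count the distinct coding sequences that translate to `peptide`."""
    total = 1
    for residue in peptide.upper():
        # Each residue may be encoded by any of its synonymous codons,
        # so the possibilities multiply along the chain.
        total *= SYNONYMOUS_CODONS[residue]
    return total


if __name__ == "__main__":
    # Hypothetical peptide Ser-Tyr-Ser-Met, used only as an example.
    print(count_encodings("SYSM"))
```

Because the count multiplies with every added residue, even a short amino acid sequence is consistent with many different nucleotide sequences, which is why knowing the protein sequence alone does not identify the particular cDNA.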

Therefore, the CAFC held that a "written description of the cDNA encoding
human insulin" must be provided. 298 This holding was premised on the fact that,
"whether or not" the specification was enabling, a "general method" for isolating the
human cDNA and the corresponding human amino acid sequences were insufficient to
support the specific claimed subject matter of cDNA encoding those amino acid
sequences. 299 In other words, the written description did not permit one skilled in

292 Id. at 1567.

293 The other patent at issue was U.S. Patent No. 4,431,740 (issued February 14, 1984).

294 The claims of the '525 patent are as follows: 

1. A recombinant plasmid replicable in procaryotic host containing within its 
nucleotide sequence a subsequence having the structure of the reverse transcript 
of an mRNA of a vertebrate, which mRNA encodes insulin. 

2. A recombinant procaryotic microorganism modified to contain a nucleotide 
sequence having the structure of the reverse transcript of an mRNA of a
vertebrate, which mRNA encodes insulin. 

3. The bacterium Escherichia coli which has been modified to contain a nucleotide 
sequence having the structure of and transcribed from the rat gene for insulin. 

4. A microorganism according to claim 2 wherein the vertebrate is a mammal. 

5. A microorganism according to claim 2 wherein the vertebrate is a human. 

6. A plasmid according to claim 1 comprising a plasmid containing at least one 
genetic determinant of col El. 

7. A microorganism according to claim 2 comprising a strain of Escherichia coli 
V.g. Patent No. 4,692 , 525 (issu ed March 24, 19871 

295 See Resents of Wie Univ. of California. 119 F.3d at 1567 ("The patent describes a method of 
obtaining this [human insulin encoding] cDNA by means of a constructive example, Example 6."). 

296 Id. ("We had previously held that a claim to a specific DNA is not made obvious by mere 
knowledge of a desired protein sequence and methods for generating the DNA that encodes that 
protein." (citations omitted)) . 

297 Id. 

the art to comprehend, on the basis of the general teachings, the specific human
cDNA sequences encoding human insulin A and B chains.

Claims 1, 2, 4, 6, and 7, which were directed to
vertebrate or mammalian cDNA, also were held to be invalid for failure to meet the
written description requirement of 35 U.S.C. § 112. 300 The basis for the court's
opinion was that the specification, which adequately described only one species (rat)
of a genus (vertebrate or mammalian), did not meet the written description
requirement with respect to the generic claims. 301 The CAFC stated that,
nevertheless, a genus of cDNAs could be adequately supported by providing a
"representative number of cDNAs," thereby linking sufficiency of the written
description with enablement:

A description of a genus of cDNAs may be achieved by means of a 
recitation of a representative number of cDNAs, defined by nucleotide 
sequence, falling within the scope of the genus or of a recitation of 
structural features common to the members of the genus, which features 
constitute a substantial portion of the genus. This is analogous to 
enablement to the genus under § 112, ¶ 1, by showing the enablement of a
representative number of species within the genus. 302
The clear implication of the court was that the specification might have met the
written description requirement other than by providing nucleotide sequences, even
though, in the case of the '525 patent, the written description requirement had not
been met:




We will not speculate in what other ways a broad genus of genetic material 
may be properly described, but it is clear to us, as it was to the district 
court, that the claimed genera of vertebrate and mammal cDNA are not 
described by the general language of the '525 patent's written description 
supported only by the specific nucleotide sequence of rat insulin. 303 




Therefore, even in Lilly, despite clear statements that the written description
requirement was distinct from the enablement requirement and that the decision
was based entirely on failure to meet the written description requirement, the court
employed reasoning associated with enablement and did not rule out alternatives to
providing exact claimed structures. 304

In Enzo Biochem, Inc. v. Gen-Probe, Inc. 305 [hereinafter Enzo I], the CAFC, on
rehearing, vacated its own prior decision 306 and reversed a district court decision
granting summary judgment invalidating claims 1 through 6 of U.S. Patent No.
300 Id. at 1569.

301 Id. at 1568 ("We agree with Lilly that the claims are invalid. Contrary to the UC's
argument, a description of rat insulin cDNA is not a description of the broad classes of vertebrate or
mammalian insulin cDNA.").

302 Id. at 1569 (citations omitted).

303 Id.

304 Id.

305 Enzo Biochem, Inc. v. Gen-Probe Inc., 323 F.3d 956 (Fed. Cir. 2002) [hereinafter Enzo I].
306 Enzo Biochem, Inc. v. Gen-Probe Inc., 285 F.3d 1013 (Fed. Cir. 2002) [hereinafter Enzo II].
patent were directed to compositions and made reference to deposits at the American
Type Culture Collection (ATCC). 308 Claims 5 and 6 were directed to assays for
detection of N. gonorrhoeae using the composition of claim 1 or a variant of that
composition. 309

In vacating its previous decision and reversing the decision of the district
court, the CAFC adopted guidelines issued by the USPTO, under which
a specification can meet the written description requirement of 35 U.S.C. § 112 by
including a "disclosure of sufficiently detailed, relevant identifying characteristics . . .
i.e., complete or partial structure, other physical and/or chemical properties,
functional characteristics when coupled with a known or disclosed correlation
between function and structure, or some combination of such characteristics." 310
However, prior to applying the standard adopted from the USPTO's Guidelines, the
court first considered, as an issue of first impression, "[w]hether reference to a
deposit of a nucleotide sequence may adequately describe that sequence . . . ." 311 The
court held that, considering the "history of biological deposits for patent purposes, the
goals of patent law, and the practical difficulties of describing unique biological
materials in a written description," 312 reference to a "deposit in a public depository,
which makes its contents accessible to the public when it is not otherwise available in
written form," did satisfy the written description requirement under the first
paragraph of 35 U.S.C. § 112. 313 The court relied on rules promulgated by the
USPTO stating that a deposit is not necessary when "it is known and readily
available to the public or can be made or isolated without undue experimentation," 314
to thereby allow deposited claimed subject matter as an alternative for
"invention[s] that cannot reasonably be enabled by a description in written form in the
specification . . . ." 315 Further, the court stated that the written description
requirement is met by what a person skilled in the art can obtain from the deposit:
307 Enzo I, 323 F.3d at 960.

308 Id. at 961-62.
309 Id. at 962.
310 Id. at 964.

311 Id. at 965 ("Whether reference to a deposit of a nucleotide sequence may adequately
describe that sequence is an issue of first impression in this court.").
312 Id.

313 Enzo I, 323 F.3d at 965.

[W]e hold that reference in the specification to a deposit in a public depository,
which makes its contents accessible to the public when it is not otherwise
available in written form, constitutes an adequate description of the deposited
material sufficient to comply with the written description requirement of § 112, ¶ 1.
Id.

314 Id. (quoting 37 C.F.R. § 1.802(b)).

315 Id. ("Inventions that cannot reasonably be enabled by a description in written form in the 
specification, but that otherwise meet the requirements for patent protection, may be described in 
surrogate form by a deposit that is incorporated by reference into the specification."). 
A person of skill in the art, reading the accession numbers in the patent
specification, can obtain the claimed sequences from the ATCC depository
by following the appropriate techniques to excise the nucleotide sequences
from the deposited organisms containing those sequences .... The
sequences are thus accessible from the disclosure in the specification ....
We therefore agree with Enzo that reference in the specification to deposits
of nucleotide sequences describe those sequences sufficiently to the public
for purposes of meeting the written description requirement. 316



Even claims 4 and 6 of the '659 patent, which were not limited to the deposited
sequences, but also included "subsequences" (defined in the specification
as having "greater than about twelve nucleotides"), 317 did not necessarily fail the
written description requirement of 35 U.S.C. § 112. According to the court, adequacy
of the written description as applied to claims 4 and 6 depended on
"whether a person of skill in the art would glean from the written description,
including information obtainable from the deposits of the claimed sequences,
subsequences, mutated variants, and mixtures sufficient to demonstrate possession
of the generic scope of the claims." 318

The court concluded with a statement of public policy behind the written
description requirement of 35 U.S.C. § 112 that is consistent with the historical quid
pro quo of enabling the public to make and use the claimed invention in exchange for
a limited period of exclusivity. 319 As stated by the court:

For biological inventions, for which providing a description in written form 
is not practicable, one may nevertheless comply with the written description 
requirement by publicly depositing the biological material, as we have held 
today. That compliance is grounded on the fact of the deposit and the 
accession number in the specification, not because a reduction to practice 
has occurred. Such description is the quid pro quo of the patent system; the 

316 Id. at 965-66.
317 Id. at 966.

Claim 4 is directed to nucleotide sequences that are selected from the group of
three deposited sequences, discrete nucleotide subsequences thereof .... Claim 6
is also similarly directed to three deposited sequences and subsequences .... The
specification defines a subsequence nonspecifically as a nucleotide sequence
"greater than about 12 nucleotides."

318

On the other hand, because the deposited sequences [of claims 4 and 6] are
described by virtue of a reference to their having been deposited, it may well be
that various subsequences, mutations, and mixtures of those sequences are also
described to one of skill in the art .... On remand, the court should determine
whether a person of skill in the art would glean from the written description,
including information obtainable from the deposits of the claimed sequences,
subsequences, mutated variants, and mixtures sufficient to demonstrate
possession of the generic scope of the claims.

319 Id. at 970.
The order granting the petition for rehearing
included three concurring and two dissenting opinions. Judge Lourie, who wrote
the majority opinion reversing the decision by the district court, concurred with
Judge Newman in the decision not to rehear the case en banc. 323 He stated that the
plain language of the first paragraph set forth both a written description
requirement and an enablement requirement, and he supported this grammatical
interpretation with a historical interpretation of the statute supporting the existence
of a written description requirement prior to imposition of a requirement for claims:
The statute states that the invention must be described. That is basic
patent law, the quid pro quo for the grant of a patent. Judge Rader notes
that historically the written description requirement served a purpose when
claims were not required. While that may be correct, when the statute
began requiring claims, it was not amended to delete the requirement; note
the comma between the description requirement and the enablement
provision, and the "and" that follows the comma. Judge Rich, whom Judge
Rader cites, was in fact one of the earliest interpreters of the statute as
having separate enablement and written description requirements. 324
321 Id. (emphasis added). Interestingly, the distinction in this passage between reduction to
practice and deposit makes sense only in the context of enablement. As was discussed in Amgen,
reduction to practice was a substitute for providing both a physical description and an enabling
method:

We hold that when an inventor is unable to envision the detailed constitution of a
gene so as to distinguish it from other materials, as well as a method of obtaining
it, conception has not been achieved until reduction to practice has occurred, i.e.,
until after the gene has been isolated.
Amgen v. Chugai, 927 F.2d 1200, 1206 (Fed. Cir. 1991). In neither Amgen nor Enzo I was there a
physical description of the product. However, in Amgen, reduction to practice ensured that the
method described was enabling: "[b]efore reduction to practice, conception only of a process for
making a substance, without a conception of a structural or equivalent definition of a substance, can
at most constitute a conception of the substance claimed as a process." Fiers v. Revel, 984 F.2d 1164,
1165 (Fed. Cir. 1993) (emphasis added). In Enzo I, lack of an enabling description in the
specification was presumed: "[i]nventions that cannot be enabled by a description in written form in
the specification, but that otherwise meet requirements for patent protection, may be described in
surrogate form by a deposit that is incorporated by reference into the specification." Enzo I, 323
F.3d at 965. Therefore, while Amgen required reduction to practice to ensure that the method
provided was enabling, in Enzo I, a publicly accessible deposit was a substitute for lack of an
enabling written description. It should be noted, however, that the court in Fiers stated that
conception also can occur in the absence of reference to a process: "[c]onception of a substance
claimed per se without reference to a process requires conception of its structure, name, formula, or
definitive chemical or physical properties." Fiers, 984 F.2d at 1169.

322 Enzo I, 323 F.3d at 970.
323 Id. at 971.

324 Id. (Lourie, J., concurring) (citations omitted). Prior to a specific requirement that the
application particularly point out and distinctly claim the invention, the relevant patent statute did,
in fact, require a written description of the invention, separate and apart from a requirement that
the applicant enable the invention. Upon inclusion of a requirement of claims in the Patent Act of
1870, the language of the statute was changed so that sufficiency of written description was "as to"
enable one skilled in the art to make and use the invention. With respect to § 112, ¶ 1, Judge
Judge Lourie disagreed with Judge Rader who, in a dissenting opinion
(discussed infra), argued that the written description in the first paragraph, until
Enzo I, was limited to establishing priority of invention. 325 He also disagreed with
the proposition that claims in the specification as originally filed necessarily
meet the written description requirement 326 and employed an example set forth
in DiLeone to distinguish enablement from the written description requirement on
the basis that the breadth of enablement of a disclosure may exceed the breadth of
the invention disclosed. 327 Moreover, Judge Lourie asserted his belief that an
elevated interest in patent protection and a consequent effort to broaden claim
coverage beyond the inventors' original intentions had led to a
renewed focus on the written description requirement:

It is said that applying the written description requirement outside of 
the priority context was novel until several years ago. Maybe so, maybe 
not; certainly such a holding was not precluded by statute or precedent.

New interpretations of old statutes in light of new fact situations occur all 
the time. I believe these issues have arisen in recent years for the same 
reason that more doctrine of equivalents issues are in the courts viz., 
because perceptions that patents are stronger tempt patent owners to try to 
assert their patents beyond the original intentions of the inventors and 
their attorney. That is why the issues are being raised and that is why we 



Markey provided a grammatical construction in his dissent in In re Barker, which is different from
Judge Lourie's in that it supports the proposition that enablement is the measure by which
satisfaction of the "written description requirement" is met:

Section 112, first paragraph, is a simple sentence, with a comma after "it," making
the phrase "in such full * * * the same" a modifier of both objects of the verb
"contain." All before that comma prescribes what shall be described. The phrase
following the comma prescribes how and for whom it shall be described.
In re Barker, 559 F.2d 588, 594 (C.C.P.A. 1977) (emphasis original).

325 Enzo I, 323 F.3d at 972 (Lourie, J., concurring). In his concurrence, Judge Lourie noted:
Moreover, the dissenters would limit the requirement, to the extent that
they credit the written description portion of the statute as being a separate
requirement at all, to priority issues. The statute does not say "a written
description of the invention for purposes of policing priority."
326 Id. In discussing the dissenting opinion, Judge Lourie commented:

I believe that the dissenters miss the point in seeing this case as involving 
an original claim or in ipsis verbis issue. There is no question that an original 
claim is part of the specification .... It is incorrect that the mere appearance of 
vague claim language in an original claim or as part of the specification 
necessarily satisfies the written description requirement or shows possession of a 
generic invention. 
327 Id. at 975. In his concurrence, Judge Lourie noted:

Some commentators have had difficulty in understanding how one may have
enabled an invention, but not described it. They believe they must coincide. As an
example of how the written description and enablement provisions differ in 
chemistry, however, one may readily have enabled the making of an invention, 
but still not have described it. . . . 
Id. (citations omitted). 
Judge Newman, in a separate concurring opinion, reiterated Judge Lourie's
opinion that the written description requirement has never been limited to
antedating prior art or establishing priority. 329 She also stated that deposit of
biological material to meet the written description requirement was "a special case"
that does not change the statutory requirement for a written description. 330

Judge Rader, in his dissent, firmly stated that, until Judge Rich's opinion in
Ruschig, there was no written description requirement apart from enablement, and
that the new "written description" ("WD") doctrine created in Ruschig extended no
further than "policing priority":
In 1967, in In re Ruschig, this court's predecessor created for the first
time a new WD doctrine to enforce priority. In the context of a new claim
added "[a]bout a year after the present application was filed," the Ruschig
court sought to determine "whether [the new] claim 13 is supported by the
disclosure of appellants' application." Rather than use § 132, however,
Ruschig assigned the role of policing priority to § 112. . . . To deal with new
matter in the claims, the court carved a new WD doctrine out of the § 112
enablement requirement.

In any event, the WD doctrine, at its inception, had a very clear
function: preventing new matter from creeping into claim amendments. 331

Judge Rader then recited language employed by Judge Rich from Ruschig,
correlating satisfaction of the written description requirement with a demonstration
in the specification that the claimed invention was in the possession of the inventor
at the time of filing, as the means for policing priority. In resolving this question,
Judge Rich stated again the purpose of the written description: "The function of the
description requirement is to ensure that the inventor had possession, as of the filing
date of the application relied on, of the specific subject matter later claimed by
him." 332 In sum, the written description was a new matter doctrine, a priority
policeman. 333

328 Id. at 971-72.

329 Id. at 975 (Newman, J., concurring). Judge Newman concurred by stating:

The theory of the dissent that a description of the invention is not needed in order
to support the claims, but serves only to antedate prior art or establish priority in
an interference, is a dramatic innovation in the theory and practice of patents. It
has never been the sole purpose of the description requirement, and negates not
only the logic but the history of patent practice.

Id.

330 Id. ("And the special case of the biological deposit is a method of complying with the
statutory requirements, as the panel now confirms; this expedient implements the statute for the
special subject matter, but does not change it.").

331 Enzo I, 323 F.3d at 977-78 (Rader, J., dissenting) (citations omitted).
332 Id. at 978 (citations omitted).

334 Id. at 979 (stating that the "WD, the equivalent of the statutory new matter doctrine,
simply has no application to claims without priority problems.").
Judge Rader's dissent summarized the written description requirement as "the
equivalent of the statutory new matter doctrine," further stating that the doctrine
"simply has no application to claims without priority problems." 334 He presented an
appendix that "will briefly explicate all written description cases from its creation in
1967 in the Court of Customs and Patent Appeals to the present." 335 Judge Rader
stated that the "appendix shows that only two cases, this ENZO case and the 1997
LILLY case have purported to apply the doctrine outside its purpose and function." 336

According to Judge Rader, the CAFC in Lilly extended the written description
requirement beyond priority to a "free-standing disclosure requirement" that substituted
for enablement: "In sum, the Lilly opinion does not test a later claim amendment
against the specification for priority, but asserts a new free-standing disclosure
requirement in place of the statutory standard of enablement." 337

In essence, according to Judge Rader, the "free-standing written description
requirement" established by the Lilly opinion was far more exacting than that of
enablement and, in effect, eliminated enablement as a means for "demarking the
boundary between pioneer inventions and patentable improvements":



Replacement of enablement doctrines with an undefined general
disclosure doctrine of WD imperils the integrity of the patent system.
Enablement, arguably the most important patent doctrine after
obviousness, has many important applications. Beyond mere adequacy of
disclosure, it serves as the line of demarcation between the visionary
theorist (adds nothing to the useful arts) and the visionary pioneer
(contributes to the useful arts), and also serves to

demarking the boundary between pioneer inventions and patentable
improvements. The WD possession test cannot perform these functions. 338



As an indication of the breadth of opinion that exists within the CAFC, Judge
Linn, in a dissenting opinion in which Judges Rader and Gajarsa joined, stated
that the sufficiency of a written description is measured in terms of enablement and
that "possession of the invention" is not relevant:

35 U.S.C. § 112 requires a written description of the invention, but the
measure of the sufficiency of that written description in meeting the
conditions of patentability in paragraph 1 of that statute, . . . should
depend solely on whether it enables any person skilled in the art to which
that invention pertains to make and use the claimed invention. . . .

Satisfaction of the "possession of the invention" test simply is not
relevant. 339

Patent validity in view of the written description requirement of the first
paragraph of 35 U.S.C. § 112 was again at issue in Amgen, Inc. v. Hoechst Marion
336 Id.

337 Id. at 980. 

338 Id. at 982 (citations omitted) . 

339 Id. at 988 (Linn, J., dissenting) (citations omitted) . 
Roussel, Inc. 340 The technology on which the patent was based included expression of
human erythropoietin ("EPO") in a host cell,
such as a Chinese hamster ovary ("CHO") cell. 341 Transkaryotic Therapies, Inc.
("TKT"), a codefendant, employed a method
wherein an endogenous EPO gene in a human cell was activated by transfection of
noncoding DNA into the chromosome of the cell. 342 The claims that Amgen was
asserting against TKT did not include a limitation that the EPO was encoded by
DNA that was exogenous to the host cell. 343 Despite the fact that all of the examples
employed non-native DNA to encode EPO, and that TKT's method of activating
unexpressed, endogenous DNA was not taught, the CAFC held that the specification
supported claims that were broad enough to include TKT's human EPO product. 344
The CAFC distinguished an earlier case relied upon by TKT, Gentry Gallery, Inc. v.
Berkline Corp., 345 on the basis that, unlike Gentry, where the claimed invention was
broader than the invention disclosed, Amgen's claimed invention, a non-naturally
occurring human EPO composition, was not broadened by embracing a product made
by a method that differed from the method disclosed in the patent specification. 346

Separately, TKT stated that claim limitations that included "non-naturally
occurring," "vertebrate cells," and "mammalian cells" excluded expression of human
EPO in human cells. 347 TKT argued that a variation in language between the
specification and a 1983 application to which priority was claimed indicated an
intentional exclusion from the specification of expressing human DNA in human host
cells. 348 The CAFC concluded that there was no such intent and that these terms
should be "construed ... in a manner consistent with their plain meaning." 349 The



340 See Amgen, Inc. v. Hoechst Marion Roussel, Inc., 314 F.3d 1313 (Fed. Cir. 2003).
341 Id. at 1321.
342 Id. at 1325.

343 Id. ("None of the asserted claims contain either an 'exogenous DNA' or 'endogenous DNA'
limitation").

344 Id. at 1334 ("In light of the evidentiary record and TKT's inability to persuade us that
precedent requires a contrary result, we hold that the district court's finding that Amgen satisfied
the written description requirement is not clearly erroneous.").

345 Gentry Gallery, Inc. v. Berkline Corp., 134 F.3d 1473 (Fed. Cir. 1998).

346 Amgen, Inc. v. Hoechst Marion Roussel, Inc., 314 F.3d 1313, 1333 (Fed. Cir. 2003). In
distinguishing the Hoechst case from the Gentry case, the court noted:

The undisclosed element leading to the Gentry court's holding of invalidity for
lack of an adequate description was a location for the controls other than on the
console, leading to a different and undescribed product .... Amgen's invention is
not the location of the control sequences and EPO DNA in relation to the cell, but
rather the production of human EPO using those sequences. Thus, the
undisclosed element TKT urges invalidates Amgen's product claims is a different
method (endogenous activation) of making the claimed compositions. But, as the
district court noted, under our precedent the patentee need only describe the
invention as claimed, and need not describe an unclaimed method of making the
claimed product. (citations omitted) (emphasis in original).

As a result, we are satisfied that the terms "non-naturally
occurring," "vertebrate" and "mammalian" should be construed as they were by the
district court, in a manner consistent with their plain meaning. Accordingly, we
court, in essence, relied on what it considered to be a "fair reading" of the
specification. 350

TKT also asserted that Amgen failed to adequately describe the use of all
vertebrate and mammalian cells. 351 The CAFC affirmed the finding by the district
court that the specification adequately described use within the broad classifications
and attributed support for the district court holding to expert testimony
that, although there "might be 'minor differences' in applying the method of the
disclosed examples" to any vertebrate or mammalian cells, "those of ordinary
skill could 'easily' figure out those differences in methodology." 352 The CAFC held
that Lilly and Enzo were "inapposite to this case because the claimed terms at issue
here are not new or unknown biological materials that ordinarily skilled artisans
would easily miscomprehend." 353

Adequacy of written description, therefore, was based on a "fair reading" of the
specification to embrace expression of human DNA in human cells, without
limitation to method as applied to composition claims, and without regard to "minor
differences" that could be "easily" figured out by one of ordinary skill in the art with
respect to claims generically embracing "mammalian" and "vertebrate" cells
exemplified only by CHO and COS-1 cells. Therefore, adequacy of the written
description was measured by enablement of one skilled in the art reading the
specification to comprehend claim scope (despite a source of the claimed product
different from that of the specification).

Judge Clevenger, in his dissent, stated that the majority improperly
distinguished Lilly by relying on the fact that the DNA sequences employed in
Amgen's claims were not novel. 354 According to the dissent, Lilly stood for two
broader principles of the written description requirement: "that in haec verba
description of broadly described generic subject matter may not suffice to describe the
subject matter of that particular claim, and that disclosure of a species may not
suffice to describe a genus . . . ." 355 Judge Clevenger also stated that the failure of
reject TKT's attempt to limit the scope of the asserted claims under an unduly
constricted reading of the specification.

Id.

350 Id. ("Moreover, the specification can fairly be read to, if not expressly, disclose the use of
human DNA in human host cells in culture . . . .").
351 Id. at 1331.
352 Id.

[T]he court weighed the testimony and found that the evidence showed that the
descriptions adequately described to those of ordinary skill in the art in 1984 the
use of the broad class of available mammalian and vertebrate cells to produce the
claimed high levels of human EPO in culture .... In so doing, the court credited
in particular the testimony of Amgen's expert, Dr. Harvey Lodish, who testified,
among other things, that there might be 'minor differences' in applying the
method of the disclosed examples (utilizing CHO and COS-1 (monkey) cells) to
any vertebrate or mammalian cells, but that those of ordinary skill could 'easily'
figure out those differences in methodology. (citations omitted).

Eli Lilly articulated two principles of the written description requirement: that in
haec verba description of broadly described generic subject matter may not suffice
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Gentry to recite location of control means was similar to the failure of Amgen's 
claims to recite the limitation that the DNA was "exogenous "!??._ .Judge. .Clevenger^ 
cautioned that failure to hmit claims to ernhodiments-e^hGit-to^e^pe^ 
would make analysis under- -35 -U,3, C. • $- -1-12 more -difiicul&- "But -the- absence- of - such 
limitations - m ust- weigh -heavily int he -section- -11:2- inquiry,- else- we- hold- that- claims 
becom e more resistant to written~desCriptiOd ^challenges the more broadly o^afteit^ ^ 
they^?^ „ j fcA 
As discussed above, the permissible breadth of Amgen's claims was limited by
the specification as understood by the skilled artisan. However, the concern
expressed by Judge Clevenger regarding the potential ...

could have been allayed by more explicitly basing satisfaction of the first paragraph
of 35 U.S.C. § 112 on whether one skilled in the art would understand, in view of the
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specification as written, that the scope of the invention included a recombinant 
method expressing endogenous DNA, as well as the exogenous embodiments 
particularly described. In other words, did the written description of the invention
"enable any person skilled in the art to which it pertains, or with which it is most
nearly connected, to make and use the same"? 360
of 35 U.S.C. § 112 that began in Enzo I continued in Moba B.V. v. Diamond
Automation Inc. 361 The majority opinion of Moba held that a claim broad enough to
encompass a machine that lifted eggs from a conveyer was not necessarily invalid for
failure to meet the written description requirement of the first paragraph of 35
U.S.C. § 112, despite the fact that the embodiment of the claimed invention in the
specification did not lift eggs directly from a moving conveyer, but rather caused
them to be stopped prior to lifting. 362 In particular, the term "holding station" in the

to describe the subject matter of that particular claim, and that disclosure of a
species may not suffice to describe a genus. The district court followed neither of
these principles here, and the majority, dismissing Eli Lilly on the grounds that
no undisclosed DNA molecule appears in this case, verges on confining Eli Lilly to
its facts.
Id. (citations omitted). 

356 Id. In discussing Gentry, Judge Clevenger noted:

Nor am I convinced that the district court's approach was faithful to Gentry
Gallery .... Because the specification failed to disclose any location for the
controls other than on the console, those claims that lacked such limitations were
invalid under § 112, ¶ 1 . . . . The question here is similar: whether the claims fail
the written description requirement for lack of "exogenous DNA" limitations,
because the specification discloses only the exogenous DNA technology that was
state of the art in 1984.
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*»Id at 1332. 

360 35 U.S.C. § 112 (2000) . 

361 Moba v. Diamond Automation, 325 F.3d 1306 (Fed. Cir. 2003).

362 Id. at 1321.

FPS's [Food Processing Systems/Moba's] contention that the '505 patent does not
adequately disclose lifting eggs from a moving conveyor merely revives its 
noninfringement argument in the cloak of a validity challenge. As noted, the jury 
found that one of skill in the art would discern possession of the invention at the 
time of filing, a finding supported by substantial record evidence. Therefore, the 
the familiar wording of the test for compliance: the specification must convey to one
without priority issues." 366 Lilly, Enzo I, and Hoechst were



proposition that the disclosure need not take any particular form, so long as
possession by the inventor is demonstrated. 367 The CAFC in Moba concluded that,
"accordingly," the jury finding in the lower court with respect to the written
description requirement of the first paragraph of 35 U.S.C. § 112 was supported by
"substantial evidence."



Id 



trial court correctly determined that claim 24 is not invalid for lack of adequate
written description.



1 



363 Id. at 1315. The court discussed the construction of claim 24 of the '505 patent:

The district court correctly construed the "holding station" of claim 24 of the
'505 patent as 'a first location in space to which an egg is moved and at which an
egg may maintain position until the egg is lifted simultaneously with an egg at a
spaced-apart location.' Nonetheless FPS argues that the district court's
construction requires that an egg cease motion before the lift to the overhead
conveyer. The claims simply do not require a specific temporal limitation
associated with the term holding . . . . Moreover, the ordinary meaning of "to
hold" is 'to keep in position, guide, control, or manage.' This meaning also
imposes no requirement that an object remain stationary.
Id. (citations omitted). 

364 Id. at 1320 ("The test for compliance with § 112 has always required sufficient information
in the original disclosure to show that the inventor possessed the invention at the time of the
original filing." (citations omitted)).

365 Id. at 1319 ("Federal Circuit case law reflects two applications of 35 U.S.C. § 112, ¶ 1. First,
in 1967, this court's predecessor inaugurated use of § 112 to prevent the addition of new matter to
claims." (citations omitted)).

366 Id. at 1320 ("The second application of the written description requirement is reflected in
Regents of University of California v. Eli Lilly & Co. There, this court invoked the written
description requirement in a case without priority issues." (citations omitted)).

367 Moba, 325 F.3d at 1321. In discussing Enzo I and Amgen, the court noted:

In Enzo and Amgen, the record showed that the specification that taught one of 
skill in the art to make and use an invention also convinced that artisan that the 
inventor possessed the invention. Similarly, in this case, the Lilly disclosure rule 
does not require a particular form of disclosure because one of skill could 
determine from the specification that the inventor possessed the invention at the 
time of filing.
368 Id. The court discussed the jury's finding:
Accordingly, substantial evidence supports the jury's finding that the '505 patent 
is not invalid for lack of an adequate written description. The '505 patent 
specification describes every element of claim 24 in sufficient detail so that one of 
ordinary skill in the art would recognize that the inventor possessed the invention 
at the time of filing. 




[1:100 2002] John Marshall Review of Intellectual Property Law



In his concurring opinion, Judge Rader admonished that CAFC case law
expanding the "written description" requirement beyond a means of "priority
protection" made "no sense." 369 According to Judge Rader, a disclosure that enables a
person skilled in the art to make and use the invention shows "possession of that
invention" under the first paragraph of 35 U.S.C. § 112. 370 He criticized
the disclosure requirement ...
In 1997, this court inexplicitly wrote a new disclosure requirement,
found nowhere in title 35, and attributed that new requirement to the
written description doctrine. This new disclosure doctrine, applied so far
only to biotechnology cases, requires a nucleotide-by-nucleotide recitation of
the structure of a biotechnological invention. Ironically, [this court] could
have reached the same result in Lilly without making a new disclosure rule.
Under the statute's enablement rule, this court would have also determined
that the invention was not sufficiently disclosed. Instead, this court
presumed to create another doctrine for sufficiency of disclosure. Although
characterized as a written description doctrine, the Lilly rule cannot in fact
trace its origin to the statute or any prior case. 371

Hoechst and Enzo I were interpreted as a "decline" of the "Lilly rule" by
explicitly recognizing that not all "functional descriptions of genetic material
necessarily [fail] as a matter of law to meet the written description requirement," and
that "deposited material satisfies the Lilly standard if it meets the enablement
standard." 372
369 Id. at 1322-23 (Rader, J., concurring).

Specifically this court - contrary to the statute and its own thirty-year body of
case law - applies the written description doctrine beyond the purpose for which
the doctrine was created, namely priority protection. By making written
description a free-standing disclosure doctrine, this court produces numerous
unintended and deleterious consequences. 

Under Federal Circuit case law, FPS [Food Processing-Systems/Moba] asked this 
jury to decide that the patent's disclosure can enable a skilled artisan to make and 
practice the invention, but still not inform that same artisan that the inventor 
was in possession of the invention. Puzzling . . . The Lilly doctrine simply makes 
no sense in this context. In fact, outside its proper context of policing priority, it 
never makes sense but compounds the confusion, increases the chances for error, 
and augments the expense of the trial process. 

370 Id. at 1323.

The language of § 112, % 1, indicates that a patent will contain an adequate 
description if it provides enough information to enable a person skilled in the art 
to make and use the invention. Any disclosure that enables one to make and use 
the invention also, by definition, also shows that the inventor was in possession of 
that full invention. Consequently, the erroneous written description requirement 
of [the] Lilly case lacks both a statutory and a logical foundation.

371 Id. at 1324 (citations omitted).

372 Id. at 1326. Judge Rader discussed the Lilly rule:
Judge Rader concluded that, nevertheless, the doctrines of "written description"
and "enablement" remain and overlap, suggesting that this will be the basis for
future confusion:

With some understanding of the difficulties and redundancy of the
Lilly rule, the CAFC has begun to convert it into the enablement doctrine
with a different name. That leaves trial courts in the fix that
the trial court found itself in this case: two
doctrines with apparently overlapping requirements. After all, to enable is
to show possession, and to show possession is to enable. 373



In Judge Bryson's concurring opinion, he stated that he did not "believe that
Lilly constituted a departure from prior law when it applied a written description
requirement in a non-priority context." 374 ...
interpretation of Ruschig as imposing ...
for the purpose of establishing priority. ...

that recent case law, including Lilly, has misinterpreted 35 U.S.C. § 112:
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Perhaps the entire line of cases stemming from Ruschig is wrong, and 
perhaps we should at some point address that question en banc. I take no 
position on that issue at this juncture. I think it is worth pointing out, 
however, that the real question raised by Judge Rader's statutory analysis 
is not whether Lilly was an unwarranted departure from the Ruschig line of



Fortunately, the viability of the Lilly rule is on the decline. After Enzo,






this court recognized that Eli Lilly did not hold that all functional descriptions of
genetic material necessarily fail as a matter of law to meet the written
description requirement; rather, the requirement may be satisfied if in the
knowledge of the art the disclosed function is sufficiently correlated to a
particular, known structure.

In this case, as in Enzo, the court explained that the written description
requirement is satisfied when "one of skill in the art would discern possession of
the invention at the time of filing." Indeed, the Enzo court struggled to
distinguish the so-called written description requirement from enablement. In
reversing its original decision that deposits of biological material do not satisfy
the written description requirement, the Enzo panel cited cases that found that
such deposits satisfy the enablement requirement. In other words, because Lilly
did in fact compel the result of the original Enzo panel, the court on
reconsideration had to concede that deposited material satisfies the Lilly standard
if it meets the enablement standard. (citations omitted).

373 Moba, 325 F.3d at 1326 (Rader, J., concurring).

374 Id. at 1327 (Bryson, J., concurring).

375 Id. In contesting Judge Rader's opinion, Judge Bryson noted:

The problem, as I see it, is that if it is correct to read section 112 as containing a
separate written description requirement, it is difficult to find a principled basis
for restricting that requirement to cases involving priority disputes. There is no
language in section 112 that would support such a restriction . . . .
The question of whether a patent specification enables one skilled in the art
either to select a claimed invention from broad teachings, to comprehend the scope of 
an invention as broadly claimed from specific embodiments taught, or to understand 
an invention as claimed to be equivalent to the language of the specification, predates 
the Patent Act of 1952. Even after 1952, such inquiries have been interpreted
variously under 35 U.S.C. §§ 112 and 132 or without any statutory basis.

Ruschig was not the first case under the Patent Act of 1952 to assess sufficiency
of support for claim scope, regardless of reliance on the first paragraph of 35 U.S.C. §
112. The facts in Ruschig are similar to those in prior cases, including Prutton,
discussed above. As in Ruschig, the court in Prutton held that claims to a specific
compound are not supported by a general description of a class of compounds. 377
Prutton was decided in 1956, well before Ruschig, and did not make specific reference
to 35 U.S.C. § 112. However, Rainer, which also was decided before Ruschig and,
also like Ruschig, addressed support for claims directed to use of particular materials
in light of a broad disclosure, does make reference to 35 U.S.C. § 112. 379

Contrary to Judge Rader's assertion, 380 sufficiency of a written description was
addressed separately, or at least distinguished from enablement, in cases decided
after Ruschig and before Lilly and Enzo I, without reference to priority. The court in
In re Robins, 381 which was not listed in Judge Rader's appendix in Enzo I of "written
description cases," for example, appeared to address inclusion of representative
examples to specifically support generic language as an issue distinct from priority
and dealing with the first paragraph of 35 U.S.C. § 112:
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Mention of representative compounds encompassed by generic claim 
language clearly is not required by § 112 or any other provision of the 
statute. But, where no explicit description of a generic invention is to be 
found in the specification (which is not the case here) mention of 
representative compounds may provide an implicit description upon which
to base generic claim language . . . . It also has been one way of teaching
376 Id. at 1328.

377 See generally Prutton v. Fuller, 230 F.2d 459 (C.C.P.A. 1956).

379 In re Rainer, 347 F.2d 574, 575 (C.C.P.A. 1965) ("As a basic proposition we note that section
112 necessarily requires us to determine what 'the invention' is and Patent Office Rule 71(b)
requires us to go further and determine the 'precise invention' for which the patent is solicited.").
See also supra Section III.C.1 discussing Rainer.

380 See supra Section III.C.3.

381 In re Robins, 429 F.2d 452 (C.C.P.A. 1970).
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how to make and/or use the claimed invention, thus satisfying that aspect of 

§ 112. 382

Judge Rich, for the court in Robins, held that the specification of the patent
application met the requirements of the first paragraph of 35 U.S.C. § 112, including
"a statement of appellants' invention, which is as broad as appellants' broadest
claims" and "a sufficiency of the specification to satisfy the 'best mode' requirement of §
112 and to enable one skilled in the art ...
is claimed." 383 There is no discussion in Robins whether the claims were amended
after filing to include the language at issue.

The Court of Customs and Patent Appeals in DiLeone 384 cited Robins, and, like
Robins, was an appeal from a decision by the Board. According to the court in
DiLeone, "[t]he sole issue is whether the specification satisfies the description
requirement of the first paragraph of 35 U.S.C. § 112, with respect to claims of the
breadth sought here." 385 The enablement and description requirements were
explicitly partitioned by the court. 386 The court in DiLeone cited Ahlbrecht as a
specific example of a case wherein the description requirement of 35 U.S.C. § 112 was
not satisfied with respect to a class of compounds, despite the fact that the same class
was enabled. 388 Although Ahlbrecht was a "priority" case cited by Judge Rader in his
dissent in Enzo I, the court in DiLeone did not refer to priority when it relied on
Robins and Ahlbrecht to state that an invention may be enabled but not described. 389
To the contrary, the court in DiLeone explicitly stated that the claim language at
issue "appeared in the originally filed claims." 390

Further, and also contrary to Judge Rader's concurring opinion in Moba, the
written description requirement as set forth by the CAFC in Lilly was not new, but
instead followed legal precedent. For example, the Lilly court cited Fiers in support
of its statement that, even in the case of cDNA, an adequate written description
"requires a precise definition, such as by structure, formula, chemical name or






382 Id. at 456-57 (emphasis added) (citations omitted). Footnote 8 of Robins further states:
"In Sus, the rejection was based on the second paragraph of § 112. For the reasons given in In re
Halleck, In re Borkowski, and In re Wakefield, such rejections are more properly considered under
the first paragraph of that section." Id. at 457 n.8 (citations omitted).

383 Id. at 456.

384 In re DiLeone, 436 F.2d 1404 (C.C.P.A. 1971).
385 Id. at 1405 (emphasis added).
387 See supra Section III.C.1.

388 DiLeone, 436 F.2d at 1405 ("In In re Ahlbrecht, decided January 7, 1971, we held that
the description requirement had not been satisfied as to a claimed class of esters, even though the
specification might have indirectly enabled one skilled in the art to make and use the entire class."
(citations omitted)).

389 Id. ("It is clear from Robins and Ahlbrecht that it is possible for an invention to enable the
practice of an invention as broadly as it is claimed, and still not describe that invention.").

390 Id. at 1406 ("Moreover, we note that the expression in question [a different dianhydride of 
an organic tetracarboxylic acid] appeared in the originally filed claims."). 




physical properties." 391 Both the Lilly and Enzo I courts, as in many earlier cases,
relied upon, or drew parallels with, enablement considerations in their
determinations of satisfaction of the written description requirement. The court in
Lilly stated that a genus claim, even of genetic material, may be adequately
supported by a written description, and that the
written description may be achieved by means "analogous to enablement of a genus
under the first paragraph of 35 U.S.C. § 112 by showing the enablement of a
representative number of species within the genus." 392 Similarly, and as noted by
Judge Rader, the holding by the Enzo I court that public deposit of
genetic material is sufficient to meet the written description requirement at least
implicitly suggests a connection between the written description requirement and
enablement by one skilled in the art. 393
As discussed above, the CAFC in Rochester II 394 affirmed a lower court
decision 395 holding that claims directed to a method of inhibiting prostaglandin H
synthase-2 (PGHS-2, or COX-2) activity in a human host by administering a
non-steroidal compound which selectively inhibits PGHS-2 gene product were invalid
for failure to meet the written description requirement under the first paragraph of
35 U.S.C. § 112. 396 The CAFC did not reach a decision with respect to the lower
court's holding that the same claims also were invalid for lack of enablement under
35 U.S.C. § 112. 397

As had many cases since Ruschig, the analysis by the CAFC stemmed from the
premise that the first paragraph of 35 U.S.C. § 112 separately includes a "written
description requirement," an "enablement requirement," and a "best mode



391 Regents of Univ. of Cal. v. Eli Lilly & Co., 119 F.3d 1559, 1566 (C.A.F.C. 1997). The court
acknowledged that:

An adequate written description of a DNA, such as the cDNA of the
recombinant plasmids and microorganisms of the '525 patent, "requires a precise
definition, such as by structure, formula, chemical name, or physical properties,"
not a mere wish or plan for obtaining the claimed chemical invention.
Id. (citations omitted).

392 Id. at 1569. The court further noted that:

A description of a genus of cDNAs may be achieved by means of a recitation of a
representative number of cDNAs, defined by nucleotide sequence, falling within
the scope of the genus or of a recitation of structural features common to the
members of the genus, which features constitute a substantial portion of the
genus. This is analogous to enablement of a genus under § 112, ¶ 1, by showing
the enablement of a representative number of species within the genus.

Id.

393 Moba v. Diamond Automation, 325 F.3d 1306, 1326 (Fed. Cir. 2003) (Rader, J., concurring)
("In other words, because Lilly did in fact compel the result of the original Enzo panel, the court on
reconsideration had to concede that deposited material satisfies the Lilly standard if it meets the
enablement requirement.").

394 Rochester II, 358 F.3d 916 (Fed. Cir. 2004).

395 Rochester I, 249 F. Supp. 2d 216 (W.D.N.Y. 2003).

396 Rochester II, 358 F.3d at 930.

397 Id. at 929-30 ("In view of our affirmance of the district court's decision on the written
description ground, we consider the enablement issue to be moot and will not discuss it further."). 
requirement." 398 The CAFC employed the example of DiLeone, discussed above, to
state that "an invention may be enabled even though it has not been described" and
that, conversely, a "specification can likewise describe an invention without enabling
the practice of the full breadth of its claims." 400 The CAFC also recited the holding in
Ruschig that, despite broad enablement by the specification, the specific compound
claimed was not taught by the specification. 401

Extrapolating the logic of Ruschig, the CAFC quoted Enzo I and stated that,
while claimed subject matter need not be described in haec verba, the written
description requirement "must still be met in some way so as to 'describe the claimed
invention so that one skilled in the art can recognize what is claimed.'" 402 Using an
analogy, the CAFC stated that use of the word "automobile," as that label would be
interpreted in the nineteenth century, would not describe a "newly invented
automobile," without further including in the description components of the claimed
invention:

Similarly, for example, in the nineteenth century, use of the word 
"automobile" would not have sufficed to describe a newly invented 
automobile; an inventor would need to describe what an automobile is, viz., 
a chassis, an engine, seats, wheels on axles, etc. Thus, generalized 
language may not suffice if it does not convey the detailed identity of an 
invention. In this case, there is no language here, generalized or otherwise, 
that describes compounds that achieve the claimed effect. 403 

By this statement, the CAFC, in effect, imposed a categorical requirement of
physical identity without specifying what or how much identity is required. In
particular, using the CAFC's analogy, no explanation is provided for determining the
limits of "viz." and "etc." Further, the CAFC appears to overlook that subject matter
is defined by how it is claimed, regardless of whether the combination of, for example,
a chassis, an engine, seats, and wheels on axles reads on what is commonly known as
an automobile or some other device, such as a golf cart, a crane, or a locomotive. The
CAFC also does not address the fact that a single well-recognized device, such as an
automobile, can be defined in different ways which may not overlap. For example,



398 Id. at 921. 

400 Rochester II, 358 F.3d 916, 921-22 (Fed. Cir. 2004) (citations omitted).

401 Id. at 922. When looking to In re Ruschig, the court stated:

In reaching its decision, the court observed that the claimed compound was not
described in the specification and would not "convey clearly to those skilled in the
art, to whom it is addressed, in any way, the information that appellants invented
that specific compound." . . . It did not teach the specific compound.
Id. (citations omitted).

402 Id. at 923. The Rochester II court, however, quoted only the latter portion of the sentence.
The complete sentence from Enzo I specifies that, even where the language of claims is supported,
the specification, "to the extent possible," must describe the claimed invention so that one skilled in
the art can recognize what is claimed. Enzo I, 323 F.3d 956, 968 (Fed. Cir. 2002). "Even if a claim is
supported by the specification, the language of the specification, to the extent possible, must
describe the claimed invention so that one skilled in the art can recognize what is claimed." Id.
(emphasis added).

403 Rochester II, 358 F.3d at 923.
404 Jepson v. Coleman, 314 F.2d 533 (C.C.P.A. 1963).
405 Rochester II, 358 F.3d at 923.

406 See generally Jepson v. Coleman, 314 F.2d 533 (C.C.P.A. 1963).

407 Rochester II, 358 F.3d at 923 ("It is not a question whether one skilled in the art might be
able to construct the patentee's device from the teachings of the disclosure of the application.
Rather, it is a question whether the application necessarily discloses that particular device."
(citations omitted)).

408 Jepson, 314 F.2d at 536. The court stated:

Unquestionably appellees in their specification accurately and concisely
disclose each and every feature of their preferred embodiment and in the so-called
'critical paragraph' herein quoted, they negatively disclose a different blanket 
than described as their preferred embodiment. That different blanket may or may 
not have all the features of appellant's device [as claimed]. Certainly it could, but 
is that sufficient to satisfy the law on this subject? We think not. 

Id. (emphasis added). 

411 In re Moore, 155 F.2d 379, 382 (C.C.P.A. 1946) ("We are of the opinion that claims 2 and 3
are broader than the disclosure in appellant's application and that they were properly rejected for
that reason.").

412 Id. The court noted that:

[A]ppellant's application is limited, as stated by the Primary Examiner and as
hereinbefore noted, to so-called 'fumigants,' and there is nothing in the
application to indicate that appellant's composition would kill insects when
applied in either solid or liquid form. On the contrary, appellant states in his
application that his alleged 'invention in its broadest aspect is concerned with the
discovery that all members of the generic class of monosubstituted acetonitriles
which have a boiling point below 200 [degrees] C. are useful as fumigants.'
Id. (emphasis added in Moore). 



one skilled in the art may recognize an automobile defined by the combined features
of a steering wheel, a windshield, a transmission, and a speedometer, as opposed to
the features cited in the analogy.

The CAFC relied on Jepson v. Coleman 404 to state that, even prior to Ruschig,
"our predecessor court explicitly rejected the notion that an enabling disclosure
necessarily satisfies the written description requirement." 405 However, Jepson was
directed to sufficiency of a specification to entitle a senior party in an interference to
make claims, and nowhere mentions 35 U.S.C. § 112. 406 A quotation taken from
Jepson by the Rochester II court, requiring that the "application necessarily" disclose
the particular device, 407 refers to the issue in Ruschig and other cases before and
after Ruschig, wherein the specification was found not to adequately direct one
skilled in the art to select a ...
disclosure. 408

In re Moore and In re Sus were also cited by the CAFC in Rochester II to support
the position that a written description requirement ...
In Moore, the claims were broader than the invention as taught in the
specification. 411 The court in Moore, in fact, referred to enablement in conjunction
with establishing what one would understand to be appellant's invention. 412
Similarly, Sus related to breadth of the invention, as claimed, given teaching that
only specific embodiments would be suitable for a particular purpose. 413 Sus, like
Moore, also at least peripherally considered enablement in conjunction with
sufficiency of disclosure. 414

Neither Jepson, Moore, nor Sus ... a statutory minimum requirement for
written description ...
art to understand how to make and use the invention. To the contrary, ...
cases addressed enablement when determining sufficiency of the written description
of a specification. 415 Moreover, as discussed supra, where adequate support for a
claimed invention was at issue, there are other cases predating Ruschig that did not
rely on a distinct statutory requirement for a written description that was separate
from enablement under 35 U.S.C. § 112; some relied on Rule 71(b) 416 or did not
explicitly rely on any statute or rule. 417 Prior to Ruschig, sufficiency of written
description was not considered to be a statutory requirement separate from
enablement.

The Rochester II court provided further historical support for a separate written
description requirement by reciting the Supreme Court case of Evans, 418 which was
decided under the Patent Act of 1793. According to the Rochester II court, although
the patent statute has changed "extensively" since 1822, it was "not very different in
its articulation of the written description requirement." 419 As discussed above,
contrary to the statement by Judge Lourie in Rochester II, the language of the
written description requirement is very different in the Patent Act of 1952 than it 
was under the Patent Act of 1793, if for no other reason than because of changes that 
were made with respect to requirements of the written description 



The Rochester II court also interpreted the earlier cases of Fiers, Lilly, and Enzo I,
which the University of Rochester attempted to distinguish as being limited to



413 In re Sus, 306 F.2d 494, 504 (C.C.P.A. 1962). The court reasoned: 

Thus, it seems to us that one skilled in this art would not be taught by the 
written description of the invention in the specification that any "aryl or 
substituted aryl radical" would be suitable for the purposes of the invention but 
rather that only certain aryl radicals and certain specifically substituted aryl 
radicals would be suitable for such purposes.
Id. (emphasis added).

414 Id. at 1316 n.7 ("We question also whether all 'aryl and substituted aryl radicals' would
produce light-sensitive aromatic azides insoluble in water but soluble in organic solvents as is
required by the invention disclosed in the specification.").

415 Jepson v. Coleman, 314 F.2d 533, 536 (C.C.P.A. 1963); Moore, 155 F.2d at 382; Sus, 306
F.2d at 497.

416 See, e.g., In re Gay, 309 F.2d 769 (C.C.P.A. 1962); In re Rainer, 347 F.2d 574 (C.C.P.A.
1965).

417 See, e.g., Prutton v. Fuller, 230 F.2d 459 (C.C.P.A. 1956).

418 Evans v. Eaton, 20 U.S. 356, 433-34 (1822).

419 Rochester II, 358 F.3d 916, 925 (Fed. Cir. 2004) ("Although the patent statutes have been
extensively revised since 1822, most notably in the addition of the requirement of claims, the
language of the present statute is not very different in its articulation of the written description
requirement.").

422 Rochester II, 358 F.3d at 925 ("Rochester also argues that Fiers v. Revel, Lilly, and Enzo
are all distinguished because they were limited to DNA-based inventions." (citation omitted)).

DNA-based inventions. 422 Although the court acknowledged that these cases all
related to genetic material, it refused to so limit application of the statute under
these cases. The CAFC then recited guidelines adopted in Enzo I regarding
satisfaction of the written description requirement:



In Enzo, we explained that functional descriptions of genetic material can, 
in some cases, meet the written description requirement if those functional 
characteristics are "coupled with a known or disclosed correlation between 
function and structure, or some combination of such characteristics." 425 






However, the CAFC recited only a portion of the written description guidelines
recited in Enzo I. The portion of the written description guidelines quoted by Enzo I
was not limited to functional characteristics coupled with correlation between
function and structure as an alternative to a complete description of structure. The
complete quote taken from Enzo I provides for additional alternatives:



In its Guidelines, the PTO has determined that the written description
requirement can be met by "show[ing] that an invention is complete by
disclosure of sufficiently detailed, relevant identifying characteristics . . .
i.e., complete or partial structure, other physical and/or chemical properties,
functional characteristics when coupled with a known or disclosed
correlation between function and structure, or some combination of such
characteristics." 426

Further, the analysis applied in Enzo I was not limited to nucleic acid 
sequences; it also included a determination of the functional characteristics of 
preferential binding of claimed antibodies in combination with the structural 
characteristics of known classes of antibody. As stated by the court in Enzo I:

For example, the PTO would find compliance with § 112, ¶ 1, for a claim to
an "isolated antibody capable of binding to an antigen X," notwithstanding
the functional definition of the antibody, in light of the well defined
structural characteristics for the five classes of antibody, the functional
characteristics of antibody binding, and the fact that the antibody
technology is well developed and mature. . . . Thus, under the Guidelines,
the written description requirement would be met for all of the claims of the
'659 patent if the functional characteristic of preferential binding to N.
gonorrhoeae over N. meningitidis were coupled with a disclosed correlation
between that function and a structure that is sufficiently known or
disclosed. We are persuaded by the Guidelines on this point and adopt the

425 Id. ("We agree with Rochester that Fiers, Lilly, and Enzo differ from this case in that they
all related to genetic material whereas this case does not, but we find that distinction to be
unhelpful to Rochester's position. It is irrelevant; the statute applies to all types of inventions.").

426 Enzo I, 323 F.3d 956, 964 (Fed. Cir. 2002) (emphasis added).

PTO's applicable standard for determining compliance with the written
description requirement. 427

The CAFC in Rochester II appeared to be much more restrictive in its
application of the USPTO Guidelines than was the CAFC in Enzo I. Specifically, the
Rochester II court employed an example of an application of the USPTO Guidelines
wherein complementary strands of nucleic acids could easily be deduced from any
given strand of DNA or RNA, and then stated that, in contrast, even providing
three-dimensional structures of enzymes may not be sufficient:

Given the sequence [...]
become a routine matter to envision the precise sequence of a
"complementary" strand that will bind to it. Therefore, disclosure of a DNA
sequence might support a claim to the complementary molecules that can
hybridize to it.

The same is not necessarily true in the chemical arts more generally.
Even with the three-dimensional structures of enzymes such as COX-1 and
COX-2 in hand, it may even now not be within the ordinary skill in the art
to predict what compounds might bind to and inhibit them, let alone have
been within the purview of one of ordinary skill in the art in the 1992-1995
period in which the applications that led to the '850 patent were filed.
Rochester and its experts do not offer any persuasive evidence to the
contrary. 428




The University of Rochester also distinguished Fiers, Lilly, and Enzo I as being
limited to composition of matter claims. 429 The CAFC dismissed this as a semantic
distinction and stated that, either by actual or constructive reduction to practice, the
"specification must teach the invention by describing it." 430 The CAFC further stated
that, absent identification in the patent specification of any compounds by the
disclosed assays, the claimed methods of their use cannot be practiced. 431 Lilly was
relied upon by the CAFC to state that, without identification of compounds that
selectively inhibit PGHS-2, the specification represents a "mere wish or plan for
obtaining the claimed invention":



As pointed out by the district court, however, the '850 patent does not
disclose just "which 'peptides, polynucleotides, and small organic molecules'
have the desired characteristic of selectively inhibiting PGHS-2." Without
such disclosure, the claimed methods cannot be said to have been described.

427 Id. (citations omitted).

428 Rochester II, 358 F.3d 916, 925 (Fed. Cir. 2004).

429 Id. at 926 ("Rochester also attempts to distinguish Fiers, Lilly, and Enzo by suggesting that 
the holdings in those cases were limited to composition of matter claims."). 

430 Id. ("We agree with the district court that that is 'a semantic distinction without a
difference' . . . . The specification must teach the invention by describing it." (citations omitted)).

431 Id. at 927 ("[I]t is undisputed that the '850 patent does not disclose any compounds that can
be used in its claimed methods. The claimed methods thus cannot be practiced based on the patent's
specification, even considering the knowledge of one skilled in the art.").

As we held in Lilly, "an adequate written description of a DNA . . . 'requires
a precise definition, such as by structure, formula, chemical name, or
physical properties,' not a mere wish or plan for obtaining the claimed
chemical invention." For reasons stated above, that requirement applies
just as well to non-DNA (or [non-]RNA) chemical inventions. 432

The CAFC in Fiers, Lilly, and Enzo I employed analyses that incorporated lack of
certainty in methods for isolating a gene of interest or lack of knowledge of the amino
acid sequence of the protein encoded by the gene. 433 Further, the CAFC in Fiers
relied on Amgen for support in discussing the inadequacies of a description that
represents merely a "wish . . . or a plan": "As we stated in Amgen and reaffirmed
above, such a disclosure just represents a wish, or arguably a plan, for obtaining the
DNA." 434

The Amgen court, in turn, like Fiers, Lilly, and Enzo I, [...]
analysis: "Based on the uncertainties of the method and lack of information 
concerning the amino acid sequence of the EPO protein, the trial court was correct in 
concluding that neither party had an adequate conception of the DNA sequence until 
reduction to practice had been achieved . . . ." 435 

In each of Amgen, Fiers, Lilly, and Enzo I, sufficiency of the description of the
invention hinged, at least in part, on lack of predictability of the methods described
to obtain the claimed genetic material, thus necessitating either identification of the 
nucleotide sequence or a publicly accessible deposit. Identification of a structure of a 
compound was required where the methods provided were not sufficiently certain to 
enable the skilled artisan to isolate or identify the claimed gene, or where a publicly 
accessible deposit had not been made. 

The University of Rochester relied on Union Oil Co. v. Atlantic Richfield Co. 436
[hereinafter Unocal] as legal precedent to support claims describing a composition by
"desired characteristics" rather than "exact chemical components." 437 The Rochester
II court, in response, distinguished the facts of Unocal by stating that, unlike that
case: "Rochester did not present any evidence that the ordinarily skilled artisan
would be able to identify any compound based on its vague functional description as a
'non-steroidal compound that selectively inhibits activity of the PGHS-2 gene
product.'" 438

Without questioning the discovery of the inventors or the ability of the assay to
distinguish between PGHS-1 and PGHS-2 inhibitors, the Rochester II court further
stated that, in the absence of novelty of any compounds identified by the assay, the
claims of the '850 patent would not be novel. 439 In dicta, therefore, the CAFC



432 Id. (citations omitted). 

433 See supra Section III.C.2 for discussion of Fiers, Lilly, and Enzo I.

434 Fiers v. Revel, 984 F.2d 1164, 1171 (Fed. Cir. 1993).

435 Amgen v. Chugai, 927 F.2d 1200, 1207 (Fed. Cir. 1991).

436 Union Oil Co. v. Atlantic Richfield Co., 208 F.3d 989 (Fed. Cir. 2000).

437 Rochester II, 358 F.3d at 926.

438 Id. at 928.

439 Id. at 928 n.7. The court noted that:

Indeed, if compounds that selectively inhibit activity of the PGHS-2 gene
product had been known in the art, it is difficult to see how the claims of the '850
patent would have satisfied the novelty requirement of 35 U.S.C. § 102. After all,

effectively barred patentability of the claimed therapeutic method unless compounds
identified by the assays would not only selectively inhibit PGHS-2 activity, but be
novel as well.



Two other cases, In re Edwards 440 and In re Herschler, 441 relied upon by the [...]
Specifically, the CAFC stated that, with respect to Edwards, the written description
requirement for a claimed compound was satisfied by teaching a method to make the
compound:

In Edwards, the court held that the written description requirement was 
satisfied by a specification that described a claimed compound by the 
process by which it is made, rather than by its structure, because the court 
found that Edward's application, "taken as a whole, reasonably leads 
persons skilled in the art to the [recited reactions] and, concomitantly, to 
the claimed compound." 443 

According to the Rochester II court, the specification provided by the University
of Rochester provided no method for making "even a single non-steroidal compound
that selectively inhibits activity of the PGHS-2 gene product." 444



However, the issue in Edwards was not, as described by the CAFC in Rochester
II, whether "the written description requirement was satisfied by a specification that
described a claimed compound by the process by which it is made, rather than by its
structure . . . ." 445 On the contrary, the court accepted that a claimed compound can
be described by the method of making it and that the "primary concern is whether
the description requirement has been complied with, not the mode selected for
compliance." 446 The issue in Edwards was whether, on the facts of that case, a parent
application complied with the "written description requirement." 447 The court in
Edwards found that, "on the facts of this case, an adequate description of the



the novelty of those claims, if any, would appear to reside in the fact that
COX-2-selective inhibitors were previously unknown. The issue of patentability
under § 102, however, was not decided by the district court, and we do not address
it further.

Id.



440 In re Edwards, 568 F.2d 1349 (C.C.P.A. 1978).
441 In re Herschler, 591 F.2d 693 (C.C.P.A. 1979).
442 Rochester II, 358 F.3d at 928.

443 Id. (quoting In re Edwards, 568 F.2d 1349, 1354 (C.C.P.A. 1978)).

444 Id. at 928 ("In marked contrast to the Edwards application, the specification of the '850
patent contains no disclosure of any method for making even a single 'nonsteroidal compound that
selectively inhibits activity of the PGHS-2 gene product.'").

445 Id.

446 Edwards, 568 F.2d at 1352 ("As the board apparently recognized, the description in the
parent is not intrinsically defective merely because appellants chose to describe their claimed 
compound by the process of making it; our primary concern is whether the description requirement 
has been complied with, not the mode selected for compliance." (emphasis added) ). 

447 Id. at 1351 ("In the context of the present case, this translates into whether the parent 
application provides adequate direction which reasonably leads persons skilled in the art to the later 
claimed compound .... By the very nature of this inquiry, each case turns on its own specific facts." 
(citations omitted)).

aforementioned reactions is, concomitantly, an adequate description of the claimed
compound." 448

As applied to the facts of Rochester II, the "mode selected for compliance" with
the written description requirement under [...]
described in the University of Rochester's '850 patent. Under such an analysis, on
the facts of Rochester II, an adequate description of the selective assay could have
been found to reasonably lead a person skilled in the art to selective inhibitors and,
"concomitantly," to the claimed therapeutic method for their use.

With respect to Herschler, the Rochester II court stated that claims directed to
concurrent topical administration of a steroidal agent and dimethyl sulfoxide
("DMSO") were supported by a specification that included only one example of a
"physiologically active steroidal agent." 449 The distinction from Herschler, according
to the Rochester II court, was that many steroidal agents were known, unlike
"non-steroidal compounds that selectively inhibit activity of the PGHS-2 gene
product." 450 Again, the Rochester II court seemed to rely on the belief that



patentability of the '850 patent claims resided in the novelty of compounds that
selectively inhibit PGHS-2 activity. 451 As the CAFC noted, "were this application
drawn to novel 'steroidal agents,' a different question would be posed. The novelty in
that invention was the DMSO solvent, not the steroids." 452

In fact, the novelty in Herschler was not of DMSO, but the use of DMSO in
combination with a steroidal agent. 453 The issue in Herschler was whether adequate
support for a claim limitation directed to a class of compounds (e.g., steroids) was met
by a specification that identified only one member of the class
(glucocorticosteroids). 454 The court concluded in the affirmative. 455 Specifically, the
court held that, with respect to the claimed method of enhancing penetration by use
of steroids in combination with DMSO, identification of only one member of the class
of steroids was, in fact, sufficient:

Steroids, when considered as drugs, have a broad scope of physiological 
activity. On the other hand, steroids, when considered as a class of 
compounds carried through a layer of skin by DMSO, appear on this record 
to be chemically quite similar. The diversity of exemplified materials
"potentiated" by DMSO in the great-grandparent application, is much



448 Id. at 1352 (emphasis added).
449 Rochester II, 358 F.3d at 928.

450 Id. ("Critically, however, there was no question in that case that, unlike 'nonsteroidal
compounds that selectively inhibit[s] activity of the PGHS-2 gene product,' numerous physiologically
active steroidal agents were known to those of ordinary skill in the art.").

451 Id.

452 Id. (citations omitted).

453 In re Herschler, 591 F.2d 693, 695 (C.C.P.A. 1979) ("The appellant has found that DMSO
enhances the penetration of a number of materials through skin tissue. In the application at hand,
a mixture of DMSO and a 'physiologically active steroidal agent' is administered to skin (or a
mucous membrane) with the result that the steroid penetrates the membrane.").

454 Id. at 696 ("We have carefully considered the great-grandparent case but the only disclosure
relating to steroids is limited to glucocorticosteroids . . . ." (citations omitted)).

455 Id. at 701 ("The question is simple: does the array of information supplied by appellant in
the great-grandparent application teach one having ordinary skill in this art that one of the class of
steroids will operate in the claimed process. We conclude that it does.").

broader than the diversity of steroid compounds shown contemporaneously
in the art. In this instance, we conclude that one having ordinary skill in
this art would have found the use of the subgenus of steroids to be apparent
in the written description of the great-grandparent application. 456



As further stated by the court in Herschler:

In sum, claims drawn to the use of known chemical compounds in a 
manner auxiliary to the invention must have a corresponding written 
description only so specific as to lead one having ordinary skill in the art to 
that class of compounds. Occasionally, a functional recitation of those 
known compounds in the specification may be sufficient as that 
description. 457

Contrary to the position of the Rochester II court, patentability of the claimed
method in Herschler did not rely upon novelty of DMSO as a compound, but rather on
the therapeutic use of such compounds. Further, the court in Herschler, as in other
cases decided prior to Rochester II, did not categorically dismiss functional
definitions of compounds employed in a claimed method. 458 The '850 patent provides
a description of a class of compounds functionally identifiable, in the case of
Rochester II, by an assay, the enablement of which the court did not address.

After dismissing a plea by amici 459 asserting that the "Court's decision will have 
a significant impact on the continuing viability of technology transfer programs at 
universities and on the equitable allocation of intellectual property rights between 
universities and the private sector," 460 the Rochester II court summarized the failure
of the '850 patent as not providing "any guidance" to compounds suitable for use in 
the claimed methods: 

In sum, because the '850 patent does not provide any guidance that
would steer the skilled practitioner toward compounds that can be used to
carry out the claimed methods, an essential element of every claim of that
patent, and has not provided evidence that any such compounds were
otherwise within the knowledge of a person of ordinary skill in the art at
the relevant time, Rochester has failed to raise any question of material fact
whether the named inventors disclosed the claimed invention. 461



Contrary to the CAFC's statement, the assay described in the '850 patent
specification was the guidance to a skilled practitioner necessary to identify
compounds that would be employed in the method claimed. There is no allegation by
the court that the assay described would not, in fact, identify existing compounds
that selectively inhibit PGHS-2 activity, nor that the amount of experimentation [...]

456 Id. (emphasis added).

457 Id. at 702 (emphasis added).
458 Id. at 701.

459 Brief of Amici Curiae The Regents of the University of California et al., University of
Rochester v. G.D. Searle & Co., 249 F. Supp. 2d 216 (2003) (No. 04-476).
460 Rochester II, 358 F.3d 916, 929 (Fed. Cir. 2004).

461 Id.

rendering the specification non-enabling. There also is no allegation by the court
that, if such a compound were identified, the disclosure would not provide an
adequate written description for its use, as claimed by the '850 patent. 462

Instead, the CAFC concluded that, absent disclosure of a PGHS-2 (COX-2)
selective compound, or "pre-existing awareness in the art of any compound having
COX-2 selective activity," the '850 patent clearly and convincingly proves its own
invalidity:




Although section 282 of the Patent Act places the burden of proof on
the party seeking to invalidate a patent, it does not foreclose the possibility
of that party demonstrating that the patent in suit proves its own
invalidity, and as detailed in section I above, we conclude that the '850
patent clearly and convincingly does just that. The patent's claims all
require a COX-2-selective compound, but no COX-2-selective compound is
disclosed in the patent, and it is undisputed that there was no pre-existing
awareness in the art of any compound having COX-2-selective activity. 463

V. IMPLICATIONS

Judge Newman, in her dissent from the order by the CAFC denying a petition
for rehearing and denying a petition for rehearing en banc, the Rochester I decision,
stated that she fully shared Judge Lourie's understanding of the law, and that "it is
simply incorrect to say that there is not now and never has been a 'written
description' requirement in the patent law." 464 She concisely summarized "past
decisions . . . offered to support the exotic proposition that it is not necessary for the
inventor to describe the patented invention, but that enablement alone suffices under
the statute," as "traditional issues of generic disclosures and specific examples, and
questions of support and predictability for scientific concepts and their
embodiments." 465

However, the heart of the issue is not elimination of a requirement that the
invention be described, but the gauge for measuring compliance with the
requirement. The danger of a free-standing written description requirement is
exemplified in Judge Rader's dissenting opinion. In his dissent, Judge Rader
reasserted his position, first announced in his dissent in Enzo I, that the modern

462 See, e.g., EPO, JPO, and USPTO, TRILATERAL PROJECT B3B, MUTUAL UNDERSTANDING IN
SEARCH AND EXAMINATION: REPORT ON COMPARATIVE STUDY ON BIOTECHNOLOGY PATENT
PRACTICES (Nov. 2001). This article does not address general discussions of reach-through claims.

463 Rochester II, 358 F.3d at 930 (emphasis added) (citations omitted) (The relevant portion of
35 U.S.C. § 282 referenced is: "The burden of establishing invalidity of a patent or any claim thereof
shall rest on the party asserting such invalidity").

464 Rochester I, 375 F.3d 1303, 1307 (Fed. Cir. 2004) (Newman, J., dissenting).

465 Id.

466 Id. at 1311 (Rader, J., dissenting).

"written description" requirement was first established in 1967, and only to "police
priority."
According to Judge Rader, the CAFC in Lilly established, without legal basis, a
new doctrine (the "Eli Lilly doctrine") that requires [...]
"adequate support" for the claims:

In simple terms, contrary to logic and the statute itself, Eli Lilly requires
one part of the specification (the written description) to provide "adequate
support" for another part of the specification (the claims). Neither Eli Lilly
nor this case has explained either the legal basis for this new validity
requirement or the standard for "adequate support." Because this new
judge-made doctrine has created enormous confusion which this court
declines to resolve, I respectfully dissent. 468



Judge Rader did not deny the existence of a written description requirement, but
restricted its proper application to determinations of priority. 469 Regardless of its 
intended purpose, the question raised by Judge Rader of the standard for "adequate 
support" under the written description requirement remains. 

A hypothetical relied upon in the majority opinion of Rochester II was employed
by Judge Rader to explain why extension of the written description requirement
beyond policing priority is "both superfluous and dangerous." 470 Judge Rader
explained that Rochester refers to a situation where a patent can enable an invention
that is not described by the specification. In the words of the opinion, "[s]uch can
occur when enablement of a closely related invention A that is both described and
enabled would similarly enable an invention B if B were described." 471 As described
in Section III.C.1, this hypothetical was also presented in DiLeone. 472

Judge Rader asserted that such a hypothetical "rarely, if ever, happens." 473 In
support of this position, he employed an analogy of an invention that "solves a
problem that enables those of ordinary skill in the art to know how to make and use
both a radio and a TV." 474 According to the hypothetical, the inventor describes only
a radio but broadly claims an "electrical receiver." 475 Judge Rader raised two issues

468 Id. at 1307-08 (Rader, J., dissenting).

469 Id. at 1311 (Rader, J., dissenting). In discussing the written description requirement, Judge
Rader stated: "In fact, every application of the written description doctrine before Eli Lilly in 1997
applied the written description doctrine for this important purpose and only for this important
purpose." Id. The "important purpose" that Judge Rader referred to is "to ensure that the inventor
had possession, as of the filing date of the application relied on, of the specific subject matter later
claimed by him." Id.

470 Id. at 1312.

471 Id. (emphasis added).

472 See supra Section III.C.1.

473 Rochester I, 375 F.3d at 1312. Judge Rader stated:

In the first place, the hypothetical rarely, if ever, happens. No actual case
presents the hypothetical. In both Eli Lilly and Rochester, for instance, the
invention A (rat insulin in Eli Lilly, an assay for Cox 1 and 2 in Rochester) was
enabled and described, but the invention B (human insulin in Eli Lilly, a Cox 2
inhibitor in Rochester) was not enabled.

associated with this hypothetical. [...]
the inventor would have disclosed and claimed it, even if only in a separate
application, and that for this "very practical reason, no case has ever presented the
hypothetical." 476 Second, it was his position that, "if the radio inventor for some
unfathomable reason does not grasp that he has enabled a TV and later asserts the
radio patent against a TV maker," the "court would properly interpret the claim as
limited to the radio." 477 According to Judge Rader, "the Eli Lilly doctrine would
instead invalidate the radio patent." 478

Leaving aside the issues of whether no case has ever presented the hypothetical, 
as alleged, and whether the written description requirement is limited to 
determinations of priority, there is a difference in application of the written
description requirement between the "Eli Lilly doctrine" and Judge Rader's
resolution of the hypothetical. The difference is that, while under the Eli Lilly
doctrine, the patent would be invalidated for lack of written description of an
"electrical receiver," Judge Rader would hold that the claim, properly interpreted, 
would be limited to the radio. This analysis, however, begs the question presented by 
Judge Rader of "adequate support." In other words, how much support is required 
for a claim to an "electrical receiver" so that it will properly encompass the 
embodiment of television? Without reference to enablement of one skilled in the art, 
whether sufficiency of written description is posed as an issue of priority or validity, 
a question of how much disclosure is sufficient to demonstrate to one skilled in the 
art that the inventor was in "possession" of the invention, or to "understand what is 
claimed and to recognize that the inventor invented what is claimed," remains. To 
simply answer that television must be described, as suggested by Judge Rader, 
would, in effect, reduce the specification to a claim, and make the presence of claims, 
as such, superfluous. 

With respect to the occurrence of any case represented by the hypothetical, a 
case that closely, if not exactly, parallels the hypothetical is Smythe. 480 In this case,
claims reciting use of an "inert fluid immiscible with said liquid samples" 481 were held
by Judge Rich to meet the written description requirement of the first paragraph of
35 U.S.C. § 112, despite the fact that the specification and original claims taught only
"air or other gas which is inert to the liquid." 482 The solicitor rejected the claims
under the written description requirement of the first paragraph of 35 U.S.C. § 112
on the basis that they were broader than the specification because "fluid" embraced
both "liquid" and "gas." 483 Judge Rich reversed the decision, stating: "We cannot







477 Id.
478 Id.

479 In re Smythe, 480 F.2d 1376, 1384 (C.C.P.A. 1973); see also supra Section III.C.1.
481 Smythe, 480 F.2d at 1378.
482 Id. at 1377.

483 Id. at 1382. The court set out that:

The solicitor, explaining the basis of this rejection on the facts of this case, takes
the position that "where the description of the invention is narrower than the
scope of protection sought by the claims," the claims may be rejected under
Section 112, paragraph 1, even though the term "fluid" embraces both "liquid" and
"gas" and even though it "would not encompass undue experimentation to arrive
at a satisfactory method and structure to employ liquid and gases other than air."
agree with the broad proposition, apparent in the above quoted language, that in
every case where the description of the invention in the specification is narrower
than that in the claim there has been a failure to fulfill the description requirement
in 35 U.S.C. § 112." 484



The reasoning applied by Judge Rich stemmed from predictability in the art:
"This is not a case where there is any unpredictability such that appellants'
description of air or other inert gas would not convey to one skilled in the art
knowledge that appellants invented an analysis system with a fluid segmentizing
medium." 485

Judge Rich noted that the issue could have been addressed under the
"enablement" portion of the first paragraph of 35 U.S.C. § 112: "[T]he board may have
also treated the rejection of these claims under 35 U.S.C. § 112 under the
'enablement' section of the first paragraph, but the solicitor has narrowed the
rejection by his argument to the description requirement." 486

Presumably, the patent could successfully be enforced against embodiments 
employing a liquid as the "inert fluid." Therefore, regardless of whether a "written 
description requirement" is employed to police priority, as arguably would be the case 
in Smythe, there is precedent for embracing certain embodiments within broad 
terminology that is supported literally in the specification only by different 
embodiments. Judge Rich, in reversing the solicitor, held that one skilled in the art 
would understand from the specification the scope of the invention as later claimed, 
given the predictability in the art. 487 Judge Rader, according to his analysis of the
hypothetical of a claimed electrical receiver embodied by a radio, apparently would
have limited enforcement of the claims in Smythe to the literal description of the
specification, without consideration of enablement by one skilled in the art. 488

The analogy employed by Judge Rader is not limited to cases that predate Lilly.
For example, Chiron Corp. v. Genentech, Inc., 489 which was decided in March 2004, prior
to Judge Rader's dissent of July 2004, was directed to monoclonal antibodies that 
bind to human breast cancer antigen. 490 The language of the claims of the issued 
patent appeared in one or more of four earlier applications to which the patent 
claimed priority. 491 The district court, prior to trial, construed the claims of the 



485 Id. at 1383.
486 Id. at 1382 n.2.
487 Id. at 1383.
488 Rochester I, 375 F.3d 1303, 1312 (Fed. Cir. 2004) (Rader, J., dissenting).
489 Chiron Corp. v. Genentech, Inc., 363 F.3d 1247, 1250 (Fed. Cir. 2004).

491 Id. at 1250-52. The claims at issue were directed to monoclonal antibodies that bind to a
human breast cancer antigen, also known as c-erbB-2, or HER2. Id. at 1250. The antibodies are
identified in claims 1 and 9 as "monoclonal antibody 454C11 which is produced by the hybridoma
deposited with the American Type Culture Collection having Accession No. HB 8484," and
"monoclonal antibody 520C9 which is produced by the hybridoma deposited with the American Type
Culture having Accession No. HB 8656," respectively. Id. Claim 19 only identifies the monoclonal
antibody as binding to human c-erbB-2 antigen. Id. Monoclonal antibody 454C11 and the
hybridoma were disclosed in the first application filed in 1984. Id. at 1251. Monoclonal antibodies
454C11 and 520C9, along with their respective hybridomas, were disclosed in a 1985
continuation-in-part (CIP) application. Id. A 1986 CIP includes these and additional murine
antibodies and hybridomas. Id. All of the antibodies disclosed in the 1984, 1985 and 1986
patent to cover murine, chimeric, and humanized antibodies that bind to c-erbB-2
("HER2") antigen. 492 The first-filed application, although including all of the
language of independent claims 1 and 19 of the issued patent, did not teach chimeric
or humanized antibodies as possible embodiments. 493 The subsequent
continuation-in-part ("CIP") applications preceding the specification of the issued
patent made reference to monoclonal antibodies as not "limited as regards the source
of the antibody or the manner in which it is made," but did not specifically identify
chimeric or humanized antibodies as possible embodiments. 494 The CAFC stated
that, under In re Hogan, 495 the first-filed application did not need to enable chimeric
or humanized antibodies because such technology was not known at the time. 496 The
CIP applications preceding the issued patent, on the other hand, were held to be
nonenabling because, by the time they were filed, chimeric and humanized antibody
technology was in existence (although "nascent") and, therefore, the specification
required disclosure and enablement of those embodiments. 497

Conversely, with respect to the first-filed application, the
issued patent could not claim priority to that application because, as interpreted by
the district court, the issued patent claims were broad enough to embrace chimeric
and humanized antibodies and this technology did not exist at the time of the
first-filed application. 498 Therefore, according to the CAFC, the inventors could not
be in "possession" of the invention and, as a consequence, the specification failed to
meet the written description requirement as applied to the issued claims. Therefore,
the '561 patent was not entitled to the filing date of the first-filed application, despite
the presence of the literal language of the claims in that first-filed application, and
despite the fact that the first-filed application was held to be enabling under the first
paragraph of 35 U.S.C. § 112.
applications were murine; none were chimeric or humanized. Id. The CIP that became the issued
patent, U.S. Patent No. 6,054,561 (issued April 25, 2000), was filed on June 7, 1995. Id.

492 Id. at 1252 ("Before trial, the district court broadly construed the claims of the '561 patent
to embrace chimeric and humanized antibodies in addition to murine antibodies that bind HER2.").
493 Id. at 1251.
494 Id. at 1251-52.

495 In re Hogan, 559 F.2d 595 (C.C.P.A. 1977).
496 Chiron Corp., 363 F.3d at 1254 ("Because the first publication documenting the successful
creation of chimeric antibodies occurred after the 1984 application, this sequence of events shows
that this new technology arose after the filing date and thus was, by definition, outside the bounds
of the enablement requirement." (citations omitted)).
497 Id. at 1256-57. The court noted:

Substantial evidence, however, supports the jury's implicit finding that the
technology was still nascent at the time of the 1986 application (as well as, of
course, at the time of the 1985 application) and thus would have still required
undue experimentation . . . . Accordingly, the record amply supports the jury's
conclusion that the 1985 and 1986 applications do not enable the claims of the
'561 patent without undue experimentation.






498 Id. at 1257.

499 Id. at 1255. The court stated:

In this case, the Chiron scientists, by definition, could not have possession of, and
disclose, the subject matter of chimeric antibodies that did not even exist at the
time of the 1984 application. Thus, axiomatically, Chiron cannot satisfy the
written description requirement for new matter appearing in the '561 patent,
namely chimeric antibodies.
The generic term, "monoclonal antibody," was limited in Chiron under a written
description requirement to embodiments explicitly listed in the specification. 500 The
scope of "possession" precluded subsequent improvements in technology within the
literal language of subsequently issued claims. 501 According to Judge Rader's
reasoning in his dissent in Rochester I, a court would hold that any claims of the
'561 patent issuing from a continuation-in-part also would be construed as not
encompassing chimeric and humanized antibodies, because such technology did not
exist at the time of filing the first-filed application. However, to do so effectively
confines the scope of claims to the language of the specification or, in the
alternative, causes any issued patent to fail to
meet the written description requirement in the face of technological advancements,
since such claims would embrace embodiments that, by definition, could not be
described in the specification.

VI. Conclusion

There has always been a requirement under United States patent law that 
inventors provide a written description of their invention. The various patent Acts
since 1790, while including the phrase "a specification in writing containing a
description" (Patent Act of 1790) or the phrase "written description" (Patent Acts of
1793, 1836 and 1952), have differed in the components of the requirement. Legal
precedent recognizes the existence of the statutory written description requirement,
and analysis of representative case law strongly suggests that comprehension of the
scope of the invention claimed by one skilled in the art underlies the requirement.
The purpose is either to notify the public as a warning against infringement, or to
put the public in possession of the invention as part of the quid pro quo of obtaining a
limited term of exclusivity.

The concept of possession by the public extends back to cases decided under the 
Patent Act of 1793. The test for possession was enablement and understanding by 
one skilled in the art of the "principle" or "mode of operation" of the invention, given 
the language of the specification filed, either to distinguish the invention from 
subject matter previously known or to define the exclusive rights of an inventor. 502
500 Id. at 1258.

Thus, the '561 patent defined "monoclonal antibody" to include chimeric and
humanized antibodies. Still only a portion of this updated meaning of "monoclonal
antibody" can claim priority to the earliest application. If required to engage in
claim construction, therefore, this court would face a dilemma: Either construe
the term according to the meaning of the earliest application but contrary to the
explicit definition in the '561 patent or construe the term according to the explicit
definition in the '561 patent but broader than the disclosure of the earliest
application. Again, the latter alternative would run afoul of the prohibition
against importing new matter into later patent documents. As noted, however,
the record amply supports the jury's verdict of invalidity without reaching this
complex claim construction question.

501 Id.

502 Patent Act of 1793, ch. 11, 1 Stat. 318-323 (February 21, 1793) (current version at 35 U.S.C.
§ 112 (2000)).
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These guiding principles have proved remarkably adept at accommodating 

extraordinary advancements in technology for the last two hundred years. So long as
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